Complete Genome Sequence of the Methanogenic Archaeon, 

Methanococcus jannaschii 

Background of the Invention 

Statement as to Rights to Inventions Made Under 
Federally-Sponsored Research and Development 

Part of the work performed during development of this invention utilized 
U.S. Government funds. The U.S. Government may have certain rights in the 
invention - DE-FC02-95ER61962; DE-FC02-95ER61963; and NAGW 2554. 

Field of the Invention 

The present application discloses the complete 1.66-megabase pair 
genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 
58- and 16-kilobase pair extrachromosomal elements. Also identified are 1738 
predicted protein-coding genes. 

Related Background Art 

The view of evolution in which all cellular organisms are in the first 
instance either prokaryotic or eukaryotic was challenged in 1977 by the finding 
that on the molecular level life comprises three primary groupings (Fox, G.E., et 
al., Proc. Natl. Acad. Sci. USA 74:4537 (1977); Woese, C.R. & Fox, G.E., Proc. 
Natl. Acad. Sci. USA 74:5088 (1977); Woese, C.R., et al., Proc. Natl. Acad. Sci. 
USA 87:4576 (1990)): the eukaryotes (Eukarya) and two unrelated groups of 
prokaryotes, Bacteria and a new group now called the Archaea. Although 
Bacteria and Archaea are both prokaryotes in a cytological sense, they differ 
profoundly in their molecular makeup (Fox, G.E., et al., Proc. Natl. Acad. Sci. 
USA 74:4537 (1977); Woese, C.R. & Fox, G.E., Proc. Natl. Acad. Sci. USA 
74:5088 (1977); Woese, C.R., et al., Proc. Natl. Acad. Sci. USA 87:4576 (1990)). 
Several lines of molecular evidence even suggest a specific relationship between 
Archaea and Eukarya (Iwabe, N., et al., Proc. Natl. Acad. Sci. USA 86:9355 
(1989); Gogarten, J.P., et al., Proc. Natl. Acad. Sci. USA 86:6661 (1989); Brown, 
J.R. and Doolittle, W.F., Proc. Natl. Acad. Sci. USA 92:2441 (1995)). 

The era of true comparative genomics has been ushered in by complete 
genome sequencing and analysis. We recently described the first two complete 
bacterial genome sequences, those of Haemophilus influenzae and Mycoplasma 
genitalium (Fleischmann, R.D., et al., Science 269:496 (1995); Fraser, C.M., et 
al., Science 270:391 (1995)). Large scale DNA sequencing efforts also have 
produced an extensive collection of sequence data from eukaryotes, including 
Homo sapiens (Adams, M.D., et al., Nature 377:3 (1995)) and Saccharomyces 
cerevisiae (Levy, J., Yeast 10:1689 (1994)). 

M. jannaschii was originally isolated by J.A. Leigh from a sediment 
sample collected from the sea floor surface at the base of a 2600 m deep "white 
smoker" chimney located at 21° N on the East Pacific Rise (Jones, W., et al., 
Arch. Microbiol. 136:254 (1983)). M. jannaschii grows at pressures of up to 
more than 500 atm and over a temperature range of 48-94 °C, with an optimum 
temperature near 85 °C (Jones, W., et al., Arch. Microbiol. 136:254 (1983)). The 
organism is autotrophic and a strict anaerobe; and, as the name implies, it 
produces methane. The dearth of archaeal nucleotide sequence data has 
hampered attempts to begin constructing a comprehensive comparative 
evolutionary framework for assessing the molecular basis of the origin and 
diversification of cellular life. 

Summary of the Invention 

The present invention is based on whole-genome random sequencing of 
an autotrophic archaeon, Methanococcus jannaschii. The M. jannaschii genome 
consists of three physically distinct elements: (i) a large circular chromosome; (ii) 
a large circular extrachromosomal element (ECE); and (iii) a small circular 
extrachromosomal element (ECE). The nucleotide sequences generated, the M. 
jannaschii chromosome, the large ECE, and the small ECE, are respectively 
provided on pages 152-585 (SEQ ID NO:1), pages 585-600 (SEQ ID NO:2), and 
pages 601-605 (SEQ ID NO:3). 

The present invention is further directed to isolated nucleic acid molecules 
comprising open reading frames (ORFs) encoding M. jannaschii proteins. The 
present invention also relates to variants of the nucleic acid molecules of the 
present invention, which encode portions, analogs or derivatives of M. jannaschii 
proteins. Further embodiments include isolated nucleic acid molecules 
comprising a polynucleotide having a nucleotide sequence at least 90% identical, 
and more preferably at least 95%, 96%, 97%, 98% or 99% identical, to the 
nucleotide sequence of a M. jannaschii ORF described herein. 

The present invention also relates to recombinant vectors, which include 
the isolated nucleic acid molecules of the present invention, host cells containing 
the recombinant vectors, as well as methods for making such vectors and host 
cells for M. jannaschii protein production by recombinant techniques. 

The invention further provides isolated polypeptides encoded by the M. 
jannaschii ORFs. It will be recognized that some amino acid sequences of the 
polypeptides described herein can be varied without significant effect on the 
structure or function of the protein. If such differences in sequence are 
contemplated, it should be remembered that there will be critical areas on the 
protein which determine activity. In general, it is possible to replace residues 
which form the tertiary structure, provided that residues performing a similar 
function are used. In other instances, the type of residue may be completely 
unimportant if the alteration occurs at a non-critical region of the protein. 

In another aspect, the invention provides a peptide or polypeptide 
comprising an epitope-bearing portion of a polypeptide of the invention. The 
epitope-bearing portion is an immunogenic or antigenic epitope useful for raising 
antibodies. 

Brief Description of the Figures 

Figure 1. A schematic showing the relationship of the three domains of 
life based on sequence data from the small subunit of rRNA (Fox, G.E., et al., 
Proc. Natl. Acad. Sci. USA 74:4537 (1977); Woese, C.R. & Fox, G.E., Proc. 
Natl. Acad. Sci. USA 74:5088 (1977); Woese, C.R., et al., Proc. Natl. Acad. Sci. 
USA 87:4576 (1990)). 

Figure 2. Structure of a putative family of insertion sequence (IS) 
elements in the M. jannaschii genome. The family of elements has been named 
ISAMJ1 and contains 11 members distributed among three groups (A, B, and C). 
The outer rectangle indicates the entire IS element; the interior rectangles indicate 
the predicted coding regions, oriented with the NH2-termini to the left. DNA 
immediately adjacent to the NH2-termini is 75 to 100% identical over 50 bp; 
DNA sequence similarity at the COOH-termini ends immediately after the stop 
codon. Black triangles indicate terminal inverted repeats. Fill patterns indicate 
which regions are missing from the elements in groups B and C. (A) Two copies 
of this family are 642 bp long and are 97% similar to each other at the nucleotide 
level. They appear to encode a protein 214 amino acids in length (ORFs MJ0017 
and MJ1466) that is 27% identical to the IS240 transposase of Bacillus 
thuringiensis (GenBank Accession number: M23741). (B) Eight copies of the 
family range in length from 358 to 360 bp and are missing a 342-bp internal 
region relative to the two members of group A. Some members of group B have 
putative frameshifts (indicated by solid arrows) and in-frame UGA codons 
(indicated by open arrows). (C) The single copy in group C is 265 bp in length 
and occurs on the large ECE. The 436 bp internal region missing from this 
element is different from that of the members of group B. 

Figure 3. Structure of a multicopy repetitive element in the M. jannaschii 
genome. Of the 18 copies identified on the main chromosome, seven are oriented 
in one direction (plus strand) and 11 are oriented in the opposite strand. Each 
element consists of a long, 391- to 425-bp repeat segment (designated LR) 
followed by up to 25 short, 27- to 28-bp repeat segments (designated SR). Each 
SR segment is separated by 31 to 51 bp of sequence that is unique within and 
between each complete repeat element. (A) The longest repeat element has an LR 
segment followed by 25 SR segments, and spans more than 2 kbp, and (B) the 
shortest complete element has an LR segment followed by two SR segments. (C) 
One element is present in the genome with five SR segments and no LR 
component. (D and E) The LR segments of two elements in the genome are 
truncated at the end adjacent to the SR segments; both are followed by a single 
SR segment. 

Figure 4. Block diagram of a computer system 102 that can be used to 
implement the computer-based systems of the present invention. 

Detailed Description of the Invention 

The present invention is based on whole-genome random sequencing of 
an autotrophic archaeon, Methanococcus jannaschii. The M. jannaschii genome 
consists of three physically distinct elements: (i) a large circular chromosome of 
1,664,976 base pairs (bp) (shown on pages 152-585 and in SEQ ID NO:1), which 
contains 1682 predicted protein-coding regions and has a G+C content of 31.4%; 
(ii) a large circular extrachromosomal element (the large ECE) of 58,407 bp 
(shown on pages 585-600 and in SEQ ID NO:2), which contains 44 predicted 
protein-coding regions and has a G+C content of 28.2%; and (iii) a small circular 
extrachromosomal element (the small ECE) of 16,550 bp (shown on pages 601- 
605 and in SEQ ID NO:3), which contains 12 predicted protein-coding regions 
and has a G+C content of 28.8%. 
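
By way of illustration only, and not as part of the sequencing work described 
herein, a G+C content value can be understood as the percentage of G and C 
residues among the unambiguous bases of a sequence. The Python sketch below 
assumes a hypothetical helper named gc_content and a toy input sequence; the 
element sizes and ORF counts are those recited in the preceding paragraph. 

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or other IUPAC codes do not skew the result.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Sizes (bp) and predicted protein-coding regions as recited in the text.
ELEMENTS = {
    "chromosome": (1_664_976, 1682),
    "large ECE": (58_407, 44),
    "small ECE": (16_550, 12),
}

total_bp = sum(size for size, _ in ELEMENTS.values())
total_orfs = sum(count for _, count in ELEMENTS.values())

if __name__ == "__main__":
    print(gc_content("ATGCATTA"))   # toy sequence, not M. jannaschii data
    print(total_bp, total_orfs)
```

The ORF counts of the three elements sum to the 1738 predicted protein-coding 
genes recited in the Field of the Invention. 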

The primary nucleotide sequences generated, the M. jannaschii 
chromosome, the large ECE, and the small ECE, are provided in SEQ ID NOs:1, 
2, and 3, respectively. As used herein, the "primary sequence" refers to the 
nucleotide sequence represented by the IUPAC nomenclature system. The 
present invention provides the nucleotide sequences of SEQ ID NOs:1, 2, and 3, 
or a representative fragment thereof, in a form which can be readily used, 
analyzed, and interpreted by a skilled artisan. 

As used herein, a "representative fragment" refers to M. jannaschii 
protein-encoding regions (also referred to herein as open reading frames), 
expression modulating fragments, uptake modulating fragments, and fragments 
that can be used to diagnose the presence of M. jannaschii in a sample. A 
non-limiting identification of such representative fragments is provided in Tables 2(a) 
and 3. As described in detail below, representative fragments of the present 
invention further include nucleic acid molecules having a nucleotide sequence at 
least 90% identical, preferably at least 95%, 96%, 97%, 98%, or 99% identical, to 
an ORF identified in Table 2(a) or 3. 

As indicated above, the nucleotide sequence information provided in SEQ 
ID NOs:1, 2 and 3 was obtained by sequencing the M. jannaschii genome using 
a megabase shotgun sequencing method. The sequences provided in SEQ ID 
NOs:1, 2 and 3 are highly accurate, although not necessarily a 100% perfect 
representation of the nucleotide sequence of the M. jannaschii genome. As 
discussed in detail below, using the information provided in SEQ ID NOs:1, 2 
and 3 and in Tables 2(a) and 3 together with routine cloning and sequencing 
methods, one of ordinary skill in the art would be able to clone and sequence all 
"representative fragments" of interest including open reading frames (ORFs) 
encoding a large variety of M. jannaschii proteins. In rare instances, this may 
reveal a nucleotide sequence error present in the nucleotide sequences disclosed 
in SEQ ID NOs:1, 2, and 3. Thus, once the present invention is made available 
(i.e., once the information in SEQ ID NOs:1, 2, and 3 and in Tables 2(a) and 3 
has been made available), resolving a rare sequencing error would be well 
within the skill of the art. Nucleotide sequence editing software is publicly 
available. For example, Applied Biosystems' (AB) AutoAssembler™ can be used 
as an aid during visual inspection of nucleotide sequences. 

Even if all of the rare sequencing errors were corrected, it is predicted that 
the resulting nucleotide sequences would still be at least about 99.9% identical 
to the reference nucleotide sequences in SEQ ID NOs:1, 2, and 3. Thus, the 
present invention further provides nucleotide sequences that are at least 99.9% 
identical to the nucleotide sequence of SEQ ID NO:1, 2, or 3 in a form which can 
be readily used, analyzed and interpreted by the skilled artisan. Methods for 
determining whether a nucleotide sequence is at least 99.9% identical to a 
reference nucleotide sequence of the present invention are described below. 

Nucleic Acid Molecules 

The present invention is directed to isolated nucleic acid fragments of the 
M. jannaschii genome. Such fragments include, but are not limited to, nucleic 
acid molecules encoding polypeptides (hereinafter open reading frames (ORFs)), 
nucleic acid molecules that modulate the expression of an operably linked ORF 
(hereinafter expression modulating fragments (EMFs)), nucleic acid molecules 
that mediate the uptake of a linked DNA fragment into a cell (hereinafter uptake 
modulating fragments (UMFs)), and nucleic acid molecules that can be used to 
diagnose the presence of M. jannaschii in a sample (hereinafter diagnostic 
fragments (DFs)). 

By "isolated nucleic acid molecule(s)" is intended a nucleic acid 
molecule, DNA or RNA, that has been removed from its native environment. For 
example, recombinant DNA molecules contained in a vector are considered 
isolated for the purposes of the present invention. Further examples of isolated 
DNA molecules include recombinant DNA molecules maintained in heterologous 
host cells, purified (partially or substantially) DNA molecules in solution, and 
nucleic acid molecules produced synthetically. Isolated RNA molecules include 
in vitro RNA transcripts of the DNA molecules of the present invention. 

In one embodiment, M. jannaschii DNA can be mechanically sheared to 
produce fragments about 15-20 kb in length, which can be used to generate a M. 
jannaschii DNA library by insertion into lambda clones as described in Example 
1 below. Primers flanking an ORF described in Table 2(a) or 3 can then be 
generated using the nucleotide sequence information provided in SEQ ID NO:1, 
2, or 3. The polymerase chain reaction (PCR) is then used to amplify and isolate 
the ORF from the lambda DNA library. PCR cloning is well known in the art. 
Thus, given SEQ ID NOs:1, 2, and 3, and Tables 2(a) and 3, it would be routine 
to isolate any ORF or other representative fragment of the M. jannaschii genome. 
Isolated nucleic acid molecules of the present invention include, but are not 
limited to, single stranded and double stranded DNA, and single stranded RNA, 
and complements thereof. 
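
The primer-generation step can be sketched as follows. This is a minimal 
illustration only, assuming a linear sequence string, 1-based inclusive ORF 
coordinates with the start smaller than the end, and hypothetical function names 
(reverse_complement, flanking_primers). Practical primer design would also 
weigh melting temperature, secondary structure, and specificity, none of which 
is modeled here. 

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(sequence: str, start: int, end: int,
                     flank: int = 0, length: int = 20):
    """Return (forward, reverse) primers bracketing an ORF.

    start and end are 1-based inclusive ORF coordinates (start < end).
    flank is the number of extra bases to include on each side of the ORF.
    The sequence is treated as linear, which is a simplification for a
    circular chromosome.
    """
    left = start - 1 - flank   # 0-based index where the amplicon begins
    right = end + flank        # 0-based exclusive index where the amplicon ends
    if start >= end or left < 0 or right > len(sequence):
        raise ValueError("coordinates or flank fall outside the sequence")
    if right - left < 2 * length:
        raise ValueError("amplicon is too short for two primers of this length")
    forward = sequence[left:left + length]
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of the last `length` bases of the amplicon.
    reverse = reverse_complement(sequence[right - length:right])
    return forward, reverse


if __name__ == "__main__":
    toy = "AAAAACCCCCGGGGGTTTTT"   # toy sequence, not M. jannaschii data
    print(flanking_primers(toy, 6, 15, flank=5, length=5))
```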

Tables 2(a), 2(b) and 3 describe ORFs in the M. jannaschii genome. In 
particular, Table 2(a) (pages 67-115 below) indicates the location of ORFs (i.e., 
the position) within the M. jannaschii genome that putatively encode the recited 
protein based on homology matching with protein sequences from the organism 
appearing in parentheticals (see the fourth column of Table 2(a)). The first 
column of Table 2(a) provides a name for each ORF. The second and third 
columns in Table 2(a) indicate an ORF's position in the nucleotide sequence 
provided in SEQ ID NO:1, 2 or 3. One of ordinary skill in the art will appreciate 
that the ORFs may be oriented in opposite directions in the M. jannaschii 
genome. This is reflected in columns 2 and 3. The fifth column of Table 2(a) 
indicates the percent identity of the protein sequence encoded by an ORF to the 
corresponding protein sequence from the organism appearing in parentheticals in 
the fourth column. The sixth column of Table 2(a) indicates the percent 
similarity of the protein sequence encoded by an ORF to the corresponding 
protein sequence from the organism appearing in parentheticals in the fourth 
column. The concepts of percent identity and percent similarity of two 
polypeptide sequences are well understood in the art and are described in more 
detail below. The eighth column in Table 2(a) indicates the length of the ORF in 
nucleotides. Each identified gene has been assigned a putative cellular role 
category adapted from Riley (Riley, M., Microbiol. Rev. 57:862 (1993)). 
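
One way to read columns 2 and 3 is sketched below in Python. The sketch 
assumes, as an interpretation rather than a statement taken from the Tables 
themselves, that the coordinates are 1-based and inclusive and that a start 
position larger than the end position denotes an ORF on the opposite strand. 
The function names are hypothetical. 

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_orf(sequence: str, col2: int, col3: int) -> str:
    """Return the coding-strand sequence of an ORF.

    col2 and col3 are the start and end positions (1-based, inclusive).
    When col2 > col3 the ORF is assumed to lie on the opposite strand,
    so the segment is reverse-complemented before it is returned.
    """
    low, high = min(col2, col3), max(col2, col3)
    if low < 1 or high > len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    segment = sequence[low - 1:high]
    return segment if col2 <= col3 else reverse_complement(segment)


if __name__ == "__main__":
    toy = "GGATGAAATAGCC"            # toy sequence, not M. jannaschii data
    print(extract_orf(toy, 3, 11))   # forward-strand reading
    print(extract_orf(toy, 11, 3))   # opposite-strand reading of the same span
```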

Table 2(b) (page 116 below) provides the single ORF identified by the 
present inventors that matches a previously published M. jannaschii gene. In 
particular, ORF MJ0479, which is 585 nucleotides in length and is positioned at 
nucleotides 1,050,508 to 1,049,948 in SEQ ID NO:1, shares 100% identity to the 
previously published M. jannaschii adenylate kinase gene. 

Table 3 (pages 117-150 below) provides ORFs of the M. jannaschii 
genome that did not elicit a homology match with a known sequence from either 
M. jannaschii or another organism. As above, the first column in Table 3 
provides the ORF name and the second and third columns indicate an ORF's 
position in SEQ ID NO:1, 2, or 3. 

Table 4 (page 151 below) provides genes of M. jannaschii that contain 
inteins. 

In the above-described Tables, there are three groups of ORF names. The 
one thousand six hundred and eighty two ORFs named "MJ-" (MJ0001-MJ1682) 
were identified on the M. jannaschii chromosome (SEQ ID NO:1). The forty four 
ORFs named "MJECL-" (MJECL01-MJECL44) were identified on the large ECE 
(SEQ ID NO:2). The twelve ORFs named "MJECS-" (MJECS01-MJECS12) were 
identified on the small ECE (SEQ ID NO:3). 

Further details concerning the algorithms and criteria used for homology 
searches are provided in the Examples below. A skilled artisan can readily 
identify ORFs in the Methanococcus jannaschii genome other than those listed 
in Tables 2(a), 2(b) and 3, such as ORFs that are overlapping or encoded by the 
opposite strand of an identified ORF in addition to those ascertainable using the 
computer-based systems of the present invention. 

Isolated nucleic acid molecules of the present invention include DNA 
molecules having a nucleotide sequence substantially different than the nucleotide 
sequence of an ORF described in Table 2(a) or 3, but which, due to the 
degeneracy of the genetic code, still encode a M. jannaschii protein. The genetic 
code is well known in the art. Thus, it would be routine to generate such 
degenerate variants. 
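
As a purely illustrative sketch of such a degenerate variant, the Python code 
below swaps each codon for a synonymous codon and confirms that the encoded 
amino acid sequence is unchanged. It assumes the standard genetic code; the 
function names and the toy ORF are hypothetical and are not taken from the 
Tables. 

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def degenerate_variant(dna: str) -> str:
    """Replace each codon with a different synonymous codon where one exists."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    out = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        synonyms = sorted(c for c, aa in CODON_TABLE.items()
                          if aa == CODON_TABLE[codon] and c != codon)
        # Codons such as ATG (Met) and TGG (Trp) have no synonym and are kept.
        out.append(synonyms[0] if synonyms else codon)
    return "".join(out)


if __name__ == "__main__":
    toy_orf = "ATGGCTAAATAA"          # toy ORF, not M. jannaschii data
    variant = degenerate_variant(toy_orf)
    print(variant, translate(variant) == translate(toy_orf))
```

The variant differs from the input at the nucleotide level while encoding the 
same polypeptide, which is the property relied upon in the preceding paragraph. 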

The present invention further relates to variants of the nucleic acid 
molecules of the present invention, which encode portions, analogs or derivatives 
of a M. jannaschii protein encoded by an ORF described in Table 2(a) or 3. 
Non-naturally occurring variants may be produced using art-known mutagenesis 
techniques and include those produced by nucleotide substitutions, deletions or 
additions. The substitutions, deletions or additions may involve one or more 
nucleotides. The variants may be altered in coding regions, non-coding regions, 
or both. Alterations in the coding regions may produce conservative or 
non-conservative amino acid substitutions, deletions or additions. Especially 
preferred among these are silent substitutions, additions and deletions, which do 
not alter the properties and activities of the M. jannaschii protein or portions 
thereof. Also especially preferred in this regard are conservative substitutions. 

Further embodiments of the invention include isolated nucleic acid 
molecules comprising a polynucleotide having a nucleotide sequence at least 90% 
identical, and more preferably at least 95%, 96%, 97%, 98% or 99% identical, to 
(a) the nucleotide sequence of an ORF described in Table 2(a) or 3, (b) the 
nucleotide sequence of an ORF described in Table 2(a) or 3, but lacking the 
codon for the N-terminal methionine residue, if present, or (c) a nucleotide 
sequence complementary to any of the nucleotide sequences in (a) or (b). By a 
polynucleotide having a nucleotide sequence at least, for example, 95% identical 
to the reference M. jannaschii ORF nucleotide sequence is intended that the 
nucleotide sequence of the polynucleotide is identical to the reference sequence 
except that the polynucleotide sequence may include up to five point mutations 
per each 100 nucleotides of the ORF sequence. In other words, to obtain a 
polynucleotide having a nucleotide sequence at least 95% identical to a reference 
ORF nucleotide sequence, up to 5% of the nucleotides in the reference sequence 
may be deleted or substituted with another nucleotide, or a number of nucleotides 
up to 5% of the total nucleotides in the reference sequence may be inserted into 
the reference sequence. These mutations of the reference sequence may occur at 
the 5' or 3' terminal positions of the reference nucleotide sequence or anywhere 
between those terminal positions, interspersed either individually among 
nucleotides in the reference sequence or in one or more contiguous groups within 
the reference sequence. 
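
The arithmetic behind these thresholds can be made concrete with a short 
Python sketch. It is an illustration under simplifying assumptions: the two 
sequences are taken to be already aligned and of equal length, so only 
substitutions are counted, and the function names are hypothetical. 

```python
import math
from fractions import Fraction


def max_point_mutations(reference_length: int, min_identity_pct) -> int:
    """Largest number of differing positions that still meets the threshold.

    Exact rational arithmetic avoids floating-point surprises for
    thresholds such as 99.9.
    """
    allowed = 1 - Fraction(str(min_identity_pct)) / 100
    return math.floor(reference_length * allowed)


def percent_identity(reference: str, candidate: str) -> float:
    """Percent identity over the full length of the reference (substitutions only)."""
    if len(reference) != len(candidate):
        raise ValueError("this sketch only handles equal-length, pre-aligned sequences")
    matches = sum(a == b for a, b in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    print(max_point_mutations(100, 95))                   # mutations allowed per 100 nt
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))   # toy sequences
```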

As a practical matter, whether any particular nucleic acid molecule is at 
least 90%, 95%, 96%, 97%, 98% or 99% identical to the nucleotide sequence of 
a M. jannaschii ORF can be determined conventionally using known computer 
programs such as the Bestfit program (Wisconsin Sequence Analysis Package, 
Version 8 for Unix, Genetics Computer Group, University Research Park, 575 
Science Drive, Madison, WI 53711). Bestfit uses the local homology algorithm 
of Smith and Waterman, Advances in Applied Mathematics 2: 482-489 (1981), 
to find the best segment of homology between two sequences. When using 
Bestfit or any other sequence alignment program to determine whether a 
particular sequence is, for instance, 95% identical to a reference sequence 
according to the present invention, the parameters are set, of course, such that the 
percentage of identity is calculated over the full length of the reference nucleotide 
sequence and that gaps in homology of up to 5% of the total number of 
nucleotides in the reference sequence are allowed. 
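
For readers unfamiliar with the local homology algorithm, the following Python 
sketch shows the core dynamic-programming recurrence of Smith and Waterman. 
It is not the Bestfit program and does not reproduce its parameters; the scoring 
values (match, mismatch, and a linear gap penalty) and the function name are 
assumptions chosen for illustration. 

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best score, aligned segment of a, aligned segment of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below zero, which makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Toy sequences sharing the segment ACGTACG; not M. jannaschii data.
    print(smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG"))
```

Note that a local alignment reports only the best-matching segment. To assess 
identity over the full length of the reference sequence, as the paragraph above 
requires, the unaligned reference positions would also have to be counted. 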

Preferred are nucleic acid molecules having sequences at least 90%, 95%, 
96%, 97%, 98% or 99% identical to the nucleic acid sequence of a M. jannaschii 
ORF that encode a functional polypeptide. By a "functional polypeptide" is 
intended a polypeptide exhibiting activity similar, but not necessarily identical, 
to an activity of the protein encoded by the M. jannaschii ORF. For example, the 
M. jannaschii ORF MJ1434 encodes an endonuclease that degrades DNA. Thus, 
a "functional polypeptide" encoded by a nucleic acid molecule having a 
nucleotide sequence, for example, 95% identical to the nucleotide sequence of 
MJ1434, will also degrade DNA. As the skilled artisan will appreciate, assays for 
determining whether a particular polypeptide is "functional" will depend on 
which ORF is used as the reference sequence. Depending on the reference ORF, 
the assay chosen for measuring polypeptide activity will be readily apparent in 
light of the role categories provided in Table 2(a). 

Of course, due to the degeneracy of the genetic code, one of ordinary skill 
in the art will immediately recognize that a large number of the nucleic acid 
molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99% 
identical to the nucleic acid sequence of a reference ORF will encode a functional 
polypeptide. In fact, since degenerate variants all encode the same amino acid 
sequence, this will be clear to the skilled artisan even without performing a 
comparison assay for protein activity. It will be further recognized in the art that, 
for such nucleic acid molecules that are not degenerate variants, a reasonable 
number will also encode a functional polypeptide. This is because the skilled 
artisan is fully aware of amino acid substitutions that are either less likely or not 
likely to significantly affect protein function (e.g., replacing one aliphatic amino 
acid with a second aliphatic amino acid). 

For example, guidance concerning how to make phenotypically silent 
amino acid substitutions is provided in Bowie, J. U. et al., "Deciphering the 
Message in Protein Sequences: Tolerance to Amino Acid Substitutions," Science 
247:1306-1310 (1990), wherein the authors indicate that there are two main 
approaches for studying the tolerance of an amino acid sequence to change. The 
first method relies on the process of evolution, in which mutations are either 
accepted or rejected by natural selection. The second approach uses genetic 
engineering to introduce amino acid changes at specific positions of a cloned gene 
and selections or screens to identify sequences that maintain functionality. As the 
authors state, these studies have revealed that proteins are surprisingly tolerant of 
amino acid substitutions. The authors further indicate which amino acid changes 
are likely to be permissive at a certain position of the protein. For example, most 
buried amino acid residues require nonpolar side chains, whereas few features of 
surface side chains are generally conserved. Other such phenotypically silent 
substitutions are described in Bowie, J.U. et al., supra, and the references cited 
therein. 

The present invention is further directed to fragments of the isolated 
nucleic acid molecules described herein. By a fragment of an isolated nucleic 
acid molecule having the nucleotide sequence of a M. jannaschii ORF is intended 
fragments at least about 15 nt, and more preferably at least about 20 nt, still more 
preferably at least about 30 nt, and even more preferably, at least about 40 nt in 
length that are useful as diagnostic probes and primers as discussed herein. Of 
course, larger fragments 50-500 nt in length are also useful according to the 
present invention as are fragments corresponding to most, if not all, of the 
nucleotide sequence of a M. jannaschii ORF. By a fragment at least 20 nt in 
length, for example, is intended fragments that include 20 or more contiguous 
bases from the nucleotide sequence of a M. jannaschii ORF. Since M. jannaschii 
ORFs are listed in Tables 2(a) and 3 and the genome sequence has been provided, 
generating such DNA fragments would be routine to the skilled artisan. For 
example, restriction endonuclease cleavage or shearing by sonication could easily 
be used to generate fragments of various sizes. Alternatively, such fragments 
could be generated synthetically. 

Preferred nucleic acid fragments of the present invention include nucleic 
acid molecules encoding epitope-bearing portions of a M. jannaschii protein. 
Methods for determining such epitope-bearing portions are described in detail 
below. 

In another aspect, the invention provides an isolated nucleic acid molecule 
comprising a polynucleotide that hybridizes under stringent hybridization 
conditions to a portion of the polynucleotide in a nucleic acid molecule of the 
invention described above, for instance, an ORF described in Table 2(a) or 3. By 
"stringent hybridization conditions" is intended overnight incubation at 42 °C in 
a solution comprising: 50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium 
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran 
sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing 
the filters in 0.1x SSC at about 65 °C. 

By a polynucleotide that hybridizes to a "portion" of a polynucleotide is 
intended a polynucleotide (either DNA or RNA) hybridizing to at least about 15 
nucleotides (nt), and more preferably at least about 20 nt, still more preferably at 
least about 30 nt, and even more preferably about 30-70 nt of the reference 
polynucleotide. These are useful as diagnostic probes and primers as discussed 
above and in more detail below. 

Of course, polynucleotides hybridizing to a larger portion of the reference 
polynucleotide (e.g., a M. jannaschii ORF), for instance, a portion 50-500 nt in 
length, or even to the entire length of the reference polynucleotide, are also useful 
as probes according to the present invention, as are polynucleotides 
corresponding to most, if not all, of a M. jannaschii ORF. 

By "expression modulating fragment" (EMF), is intended a series of 
nucleotides that modulate the expression of an operably linked ORF or EMF. A 
sequence is said to "modulate the expression of an operably linked sequence" 
when the expression of the sequence is altered by the presence of the EMF. 
EMFs include, but are not limited to, promoters, and promoter modulating 
sequences (inducible elements). One class of EMFs are fragments that induce the 
expression of an operably linked ORF in response to a specific regulatory factor 
or physiological event. EMF sequences can be identified within the M. 
jannaschii genome by their proximity to the ORFs described in Tables 2(a), 2(b), 
and 3. An intergenic segment, or a fragment of the intergenic segment, from 
about 10 to 200 nucleotides in length, taken 5' from any one of the ORFs of 
Tables 2(a), 2(b) or 3 will modulate the expression of an operably linked 3' ORF 
in a fashion similar to that found with the naturally linked ORF sequence. As 
used herein, an "intergenic segment" refers to the fragments of the M. jannaschii 
genome that are between two ORF(s) herein described. Alternatively, EMFs can 
be identified using known EMFs as a target sequence or target motif in the 
computer-based systems of the present invention. 
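
The notion of an intergenic segment can be illustrated with the Python sketch 
below, which lists the stretches of sequence lying between annotated ORFs. It 
is a simplified illustration and not the computer-based system of the invention: 
the sequence is treated as linear, ORF coordinates are assumed to be 1-based 
and inclusive in either orientation, and the function name and toy coordinates 
are hypothetical. Segments shorter than about 10 nt are skipped; a candidate 
EMF of up to about 200 nt could then be taken from the end of a segment that 
lies 5' to the ORF of interest. 

```python
from typing import Iterable, List, Tuple


def intergenic_segments(orfs: Iterable[Tuple[int, int]],
                        min_len: int = 10) -> List[Tuple[int, int]]:
    """Return (start, end) coordinates of gaps between ORFs, 1-based inclusive."""
    # Normalise each ORF to (low, high) so that strand orientation does not matter.
    spans = sorted((min(a, b), max(a, b)) for a, b in orfs)
    gaps: List[Tuple[int, int]] = []
    covered_to = None   # highest position covered by any ORF seen so far
    for low, high in spans:
        if covered_to is not None and low > covered_to + 1:
            gap_start, gap_end = covered_to + 1, low - 1
            if gap_end - gap_start + 1 >= min_len:
                gaps.append((gap_start, gap_end))
        covered_to = high if covered_to is None else max(covered_to, high)
    return gaps


if __name__ == "__main__":
    # Toy coordinates; the second ORF is written end-first to mimic the opposite strand.
    toy_orfs = [(1, 100), (300, 151), (320, 400), (1000, 2000)]
    print(intergenic_segments(toy_orfs))
```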

The presence and activity of an EMF can be confirmed using an EMF trap 
vector. An EMF trap vector contains a cloning site 5' to a marker sequence. A 
marker sequence encodes an identifiable phenotype, such as antibiotic resistance 
or a complementing nutrition auxotrophic factor, which can be identified or 
assayed when the EMF trap vector is placed within an appropriate host under 
appropriate conditions. As described above, an EMF will modulate the 
expression of an operably linked marker sequence. A more detailed discussion 
of various marker sequences is provided below. 

A sequence that is suspected as being an EMF is cloned in all three 
reading frames in one or more restriction sites upstream from the marker 
sequence in the EMF trap vector. The vector is then transformed into an 
appropriate host using known procedures and the phenotype of the transformed 
host is examined under appropriate conditions. As described above, an EMF will 
modulate the expression of an operably linked marker sequence. 

By "uptake modulating fragment" (UMF), is intended a series of 
nucleotides that mediate the uptake of a linked DNA fragment into a cell. UMFs 
can be readily identified using known UMFs as a target sequence or target motif 
with the computer-based systems described below. The presence and activity of 
a UMF can be confirmed by attaching the suspected UMF to a marker sequence. 
The resulting nucleic acid molecule is then incubated with an appropriate host 
under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake 
of a linked marker sequence. 

By a "diagnostic fragment" (DF), is intended a series of nucleotides that 
selectively hybridize to M. jannaschii sequences. DFs can be readily identified 
by identifying unique sequences within the M. jannaschii genome, or by 
generating and testing probes or amplification primers consisting of the DF 
sequence in an appropriate diagnostic format for amplification or hybridization 
selectivity. 
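
The first route to a DF described above, identifying sequences unique to the target genome, can be illustrated with a minimal sketch. It is not a method taken from this disclosure: the function names, the choice of fixed-length k-mers, and the idea of screening against a list of background sequences on both strands are all assumptions made for illustration, and any candidate would still need to be tested for hybridization or amplification selectivity.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_diagnostic_kmers(target: str, background: list[str], k: int) -> list[str]:
    """Return k-mers of the target that are absent from every background
    sequence on either strand (hypothetical screening step)."""
    seen: set[str] = set()
    for seq in background:
        seen |= kmers(seq, k)
        seen |= kmers(reverse_complement(seq), k)
    return sorted(kmers(target, k) - seen)


if __name__ == "__main__":
    # Toy sequences, not real genome data.
    print(candidate_diagnostic_kmers("ACGTTGCA", ["ACGTT"], 4))
```

In practice k would be chosen near the intended probe or primer length, and the background would be the genomes expected in the sample.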

Each of the ORFs of the M. jannaschii genome disclosed in Tables 2(a) 
and 3, and the EMF found 5' to the ORF, can be used in numerous ways as 
polynucleotide reagents. The sequences can be used as diagnostic probes or 
diagnostic amplification primers to detect the presence of M. jannaschii in a sample. 
This is especially the case with the fragments or ORFs of Table 3, which will be 
highly selective for M. jannaschii. 

In addition, the fragments of the present invention, as broadly described, 
can be used to control gene expression through triple helix formation or antisense 
DNA or RNA, both of which methods are based on the binding of a 
polynucleotide sequence to DNA or RNA. Polynucleotides suitable for use in 
these methods are usually 20 to 40 bases in length and are designed to be 
complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); 
and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - 
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense 
Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). 
Triple helix formation optimally results in a shut-off of RNA 
transcription from DNA, while antisense RNA hybridization blocks translation 
of an mRNA molecule into polypeptide. Both techniques have been 
demonstrated to be effective in model systems. Information contained in the 
sequences of the present invention is necessary for the design of an antisense or 
triple helix oligonucleotide. 
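
As a rough illustration of how sequence information feeds antisense design, the following sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA. The function name, the default length of 20, and the example sequence are hypothetical; only the 20 to 40 base range comes from the text above, and no claim is made that an oligonucleotide chosen this way will be effective.

```python
# Complement table for building a DNA oligo against an RNA (or DNA) target.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (written 5'->3') that is antiparallel and
    complementary to mrna[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    target = mrna[start:start + length].upper()
    if len(target) != length:
        raise ValueError("window runs past the end of the sequence")
    # Reverse the window, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    # Toy mRNA fragment, not an M. jannaschii sequence.
    print(antisense_oligo("AUGGCUAAGCUUGGAUCCGAAUUC", 0, 20))
```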

Vectors and Host Cells 

The present invention further provides recombinant constructs comprising 
one or more fragments of the M. jannaschii genome. The recombinant constructs 
of the present invention comprise a vector, such as a plasmid or viral vector, into 
which, for example, a M. jannaschii ORF is inserted. The vector may further 
comprise regulatory sequences, including for example, a promoter, operably 
linked to the ORF. For vectors comprising the EMFs and UMFs of the present 
invention, the vector may further comprise a marker sequence or heterologous 
ORF operably linked to the EMF or UMF. Large numbers of suitable vectors and 
promoters are known to those of skill in the art and are commercially available 
for generating the recombinant constructs of the present invention. The following 
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, 
pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); 
pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: 
pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, 
pSVL (Pharmacia). 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. 
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial 
promoters include lacI, lacZ, T3, T7, gpt, lambda P_R, and trc. Eukaryotic 
promoters include CMV immediate early, HSV thymidine kinase, early and late 
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the 
appropriate vector and promoter is well within the level of ordinary skill in the 
art. 

The present invention further provides host cells containing any one of the 
isolated fragments (preferably an ORF) of the M. jannaschii genome described 
herein. The host cell can be a higher eukaryotic host cell, such as a mammalian 
cell, a lower eukaryotic host cell, such as a yeast cell, or the host cell can be a 
prokaryotic cell, such as a bacterial cell. Introduction of the recombinant 
construct into the host cell can be effected by calcium phosphate transfection, 
DEAE-dextran mediated transfection, or electroporation (Davis, L. et al., Basic 
Methods in Molecular Biology (1986)). Host cells containing, for example, a M. 
jannaschii ORF can be used conventionally to produce the encoded protein. 



Polypeptides and Fragments 



The invention further provides an isolated polypeptide encoded by a M. 
jannaschii ORF described in Tables 2(a) or 3, or a peptide or polypeptide 
comprising a portion of the isolated polypeptide. The terms "peptide" and 
"oligopeptide" are considered synonymous (as is commonly recognized) and each 
term can be used interchangeably as the context requires to indicate a chain of at 
least two amino acids coupled by peptidyl linkages. The word "polypeptide" is 
used herein for chains containing more than ten amino acid residues. 

It will be recognized in the art that some amino acid sequence of the M. 
jannaschii polypeptide can be varied without significant effect on the structure or 
function of the protein. If such differences in sequence are contemplated, it 
should be remembered that there will be critical areas on the protein which 
determine activity. In general, it is possible to replace residues which form the 
tertiary structure, provided that residues performing a similar function are used. 
In other instances, the type of residue may be completely unimportant if the 
alteration occurs at a non-critical region of the protein. 

Thus, the invention further includes variations of a M. jannaschii protein 
encoded by an ORF described in Table 2(a) or 3 that show substantial protein 
activity. Methods for assaying such "functional polypeptides" for protein activity 
are described above. Variations include deletions, insertions, inversions, repeats, 
and type substitutions (for example, substituting one hydrophilic residue for 
another, but not strongly hydrophilic for strongly hydrophobic as a rule). Small 
changes or such "neutral" amino acid substitutions will generally have little effect 
on protein activity. 

Typically seen as conservative substitutions are the replacements, one for 
another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the 
hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; 
substitution between the amide residues Asn and Gln; exchange of the basic 
residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. 
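
The groupings listed in the preceding paragraph can be encoded as a simple lookup. The sketch below is illustrative only: the dictionary name and function name are hypothetical, the one-letter codes are the standard ones for the residues named above, and membership in the same group is treated as the sole criterion for a conservative substitution, which is a simplifying assumption.

```python
# Groups taken from the paragraph above, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"A", "V", "L", "I"},
    "hydroxyl": {"S", "T"},
    "acidic": {"D", "E"},
    "amide": {"N", "Q"},
    "basic": {"K", "R"},
    "aromatic": {"F", "Y"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two different residues fall in the same group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


if __name__ == "__main__":
    for pair in [("L", "I"), ("D", "E"), ("K", "D")]:
        print(pair, is_conservative(*pair))
```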

As indicated in detail above, further guidance concerning amino acid 
changes that are likely to be phenotypically silent (i.e., are not likely to have a 
significant deleterious effect on function) can be found in Bowie, J.U., et al, 
"Deciphering the Message in Protein Sequences: Tolerance to Amino Acid 
Substitutions," Science 247:1306-1310 (1990). 

The fragment, derivative, variant or analog of a M. jannaschii polypeptide 
encoded by an ORF described in Table 2(a) or 3, may be (i) one in which one or 
more of the amino acid residues are substituted with a conserved or non- 
conserved amino acid residue (preferably a conserved amino acid residue) and 
such substituted amino acid residue may or may not be one encoded by the 
genetic code, or (ii) one in which one or more of the amino acid residues includes 
a substituent group, or (iii) one in which the polypeptide is fused with another 
compound, such as a compound to increase the half-life of the polypeptide (for 
example, polyethylene glycol), or (iv) one in which the additional amino acids are 
fused to the polypeptide, such as an IgG Fc fusion region peptide or leader or 
secretory sequence or a sequence which is employed for purification of the 
polypeptide or a proprotein sequence. Such fragments, derivatives and analogs 
are deemed to be within the scope of those skilled in the art from the teachings 
herein. 

Of particular interest are substitutions of charged amino acids with 
another charged amino acid and with neutral or negatively charged amino acids. 
The latter results in proteins with reduced positive charge to improve the 
characteristics of a M. jannaschii ORF-encoded protein. The prevention of 
aggregation is highly desirable. Aggregation of proteins not only results in a loss 
of activity but can also be problematic when preparing pharmaceutical 
formulations, because aggregates can be immunogenic (Pinckard et al., Clin. Exp. 
Immunol. 2:331-340 (1967); Robbins et al., Diabetes 36:838-845 (1987); Cleland 
et al., Crit. Rev. Therapeutic Drug Carrier Systems 10:307-377 (1993)). 

As indicated, changes are preferably of a minor nature, such as 
conservative amino acid substitutions that do not significantly affect the folding 
or activity of the protein (see Table 1). 



TABLE 1. Conservative Amino Acid Substitutions. 

Aromatic       Phenylalanine 
               Tryptophan 
               Tyrosine 
Hydrophobic    Leucine 
               Isoleucine 
               Valine 
Polar          Glutamine 
               Asparagine 
Basic          Arginine 
               Lysine 
               Histidine 
Acidic         Aspartic Acid 
               Glutamic Acid 
Small          Alanine 
               Serine 
               Threonine 
               Methionine 
               Glycine 



Amino acids in a M. jannaschii ORF-encoded protein of the present 
invention that are essential for function can be identified by methods known in 
the art, such as site-directed mutagenesis or alanine-scanning mutagenesis 
(Cunningham and Wells, Science 244:1081-1085 (1989)). The latter procedure 
introduces single alanine mutations at every residue in the molecule. 
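
The set of variants implied by alanine scanning can be enumerated as in the sketch below. The function name and the mutation naming scheme (for example K2A) are assumptions made for illustration, and positions that are already alanine are skipped because substituting alanine there would leave the sequence unchanged.

```python
def alanine_scan(protein: str) -> list[tuple[str, str]]:
    """Return (mutation name, variant sequence) for each single
    residue-to-alanine substitution, using 1-based positions."""
    variants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue  # already alanine, nothing to change
        name = f"{residue}{index + 1}A"
        variants.append((name, protein[:index] + "A" + protein[index + 1:]))
    return variants


if __name__ == "__main__":
    # Toy peptide, not an M. jannaschii sequence.
    for name, sequence in alanine_scan("MKAL"):
        print(name, sequence)
```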

The polypeptides of the present invention are preferably provided in an 
isolated form. By "isolated polypeptide" is intended a polypeptide removed from 
its native environment. Thus, a polypeptide produced and/or contained within a 
recombinant host cell is considered isolated for purposes of the present invention. 
Also intended as an "isolated polypeptide" are polypeptides that have been 
purified, partially or substantially, from a recombinant host cell. For example, 
a recombinantly produced version of a M. jannaschii ORF-encoded protein can 
be substantially purified by the one-step method described in Smith and Johnson, 
Gene 67:31-40 (1988). 

The polypeptides of the present invention include the proteins encoded by 
(a) an ORF described in Table 2(a) or 3 or (b) an ORF described in Table 2(a) or 
3, but minus the codon for the N-terminal methionine residue, if present, as well 
as polypeptides that have at least 90% similarity, more preferably at least 95% 
similarity, and still more preferably at least 96%, 97%, 98% or 99% similarity to 
a M. jannaschii ORF-encoded protein. Further polypeptides of the present 
invention include polypeptides at least 90% identical, more preferably at least 
95% identical, still more preferably at least 96%, 97%, 98% or 99% identical to 
a M. jannaschii ORF-encoded protein. 

By "% similarity" for two polypeptides is intended a similarity score 
produced by comparing the amino acid sequences of the two polypeptides using 
the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix, 
Genetics Computer Group, University Research Park, 575 Science Drive, 
Madison, WI 53711) and the default settings for determining similarity. Bestfit 
uses the local homology algorithm of Smith and Waterman (Advances in Applied 
Mathematics 2:482-489, 1981) to find the best segment of similarity between two 
sequences. 
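
For orientation, the local homology algorithm of Smith and Waterman can be sketched in a few lines. This is a simplified illustration, not a reimplementation of Bestfit: the match, mismatch and gap values are arbitrary assumptions, a single linear gap penalty is used, and Bestfit's default scoring matrix and gap settings are not reproduced, so scores from this sketch are not "% similarity" values as defined above.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best score, aligned segment of a, aligned segment of b)
    for the highest-scoring local alignment under a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # scoring matrix, floored at zero
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(smith_waterman("AAAGGGTTT", "CCGGGCC"))
```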

By a polypeptide having an amino acid sequence at least, for example, 
95% "identical" to a reference amino acid sequence of a M. jannaschii ORF-
encoded protein is intended that the amino acid sequence of the polypeptide is 
identical to the reference sequence except that the polypeptide sequence may 
include up to five amino acid alterations per each 100 amino acids of the 
reference sequence. In other words, to obtain a polypeptide having an amino acid 
sequence at least 95% identical to a reference amino acid sequence, up to 5% of 
the amino acid residues in the reference sequence may be deleted or substituted 
with another amino acid, or a number of amino acids up to 5% of the total amino 
acid residues in the reference sequence may be inserted into the reference 
sequence. These alterations of the reference sequence may occur at the amino or 
carboxy terminal positions of the reference amino acid sequence or anywhere 
between those terminal positions, interspersed either individually among residues 
in the reference sequence or in one or more contiguous groups within the 
reference sequence. 

As a practical matter, whether any particular polypeptide has an amino 
acid sequence at least 90%, 95%, 96%, 97%, 98% or 99% identical to the amino 
acid sequence of a M. jannaschii ORF-encoded protein can be determined 
conventionally using known computer programs such as the Bestfit program 
(Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer 
Group, University Research Park, 575 Science Drive, Madison, WI 53711). 
When using Bestfit or any other sequence alignment program to determine 
whether a particular sequence is, for instance, 95% identical to a reference 
sequence according to the present invention, the parameters are set, of course, 
such that the percentage of identity is calculated over the full length of the 
reference amino acid sequence and that gaps in homology of up to 5% of the total 
number of amino acid residues in the reference sequence are allowed. 
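
The arithmetic behind the identity definition can be made concrete with a short sketch. It assumes the two sequences have already been aligned by some program and are supplied as equal-length strings with "-" marking gaps; the function names are hypothetical, and counting every differing alignment column as one alteration is a simplifying assumption rather than the behavior of any particular alignment program.

```python
def allowed_alterations(reference_length: int, percent_identity: int) -> int:
    """Maximum number of alterations permitted at the given identity level,
    e.g. 5 per 100 reference residues at 95%."""
    return reference_length * (100 - percent_identity) // 100


def count_alterations(ref_aligned: str, query_aligned: str) -> int:
    """Count alignment columns where the two sequences differ
    (substitutions, deletions and insertions)."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for r, q in zip(ref_aligned, query_aligned) if r != q)


def meets_identity(ref_aligned: str, query_aligned: str, percent_identity: int) -> bool:
    """Check the alteration count against the allowance computed over the
    full (ungapped) length of the reference sequence."""
    reference_length = len(ref_aligned.replace("-", ""))
    limit = allowed_alterations(reference_length, percent_identity)
    return count_alterations(ref_aligned, query_aligned) <= limit


if __name__ == "__main__":
    reference = "A" * 100
    query = "G" * 5 + "A" * 95  # five substitutions in 100 residues
    print(meets_identity(reference, query, 95))
```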

As described in detail below, the polypeptides of the present invention can 
also be used to raise polyclonal and monoclonal antibodies, which are useful in 
assays for detecting M. jannaschii protein expression. 

In another aspect, the invention provides a peptide or polypeptide 
comprising an epitope-bearing portion of a polypeptide of the invention. The 
epitope of this polypeptide portion is an immunogenic or antigenic epitope of a 
polypeptide of the invention. An "immunogenic epitope" is defined as a part of 
a protein that elicits an antibody response when the whole protein is the 
immunogen. These immunogenic epitopes are believed to be confined to a few 
loci on the molecule. On the other hand, a region of a protein molecule to which 
an antibody can bind is defined as an "antigenic epitope." The number of 
immunogenic epitopes of a protein generally is less than the number of antigenic 
epitopes. See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-
4002 (1983). 

As to the selection of peptides or polypeptides bearing an antigenic 
epitope (i.e., that contain a region of a protein molecule to which an antibody can 
bind), it is well known in the art that relatively short synthetic peptides that 
mimic part of a protein sequence are routinely capable of eliciting an antiserum 
that reacts with the partially mimicked protein. See, for instance, Sutcliffe, J. G., 
Shinnick, T. M., Green, N. and Lerner, R. A. (1983) Antibodies that react with 
predetermined sites on proteins. Science 219:660-666. Peptides 
capable of eliciting protein-reactive sera are frequently represented in the primary 
sequence of a protein, can be characterized by a set of simple chemical rules, and 
are confined neither to immunodominant regions of intact proteins (i.e., 
immunogenic epitopes) nor to the amino or carboxyl terminals. Peptides that are 
extremely hydrophobic and those of six or fewer residues generally are ineffective 
at inducing antibodies that bind to the mimicked protein; longer peptides, 
especially those containing proline residues, usually are effective. Sutcliffe et al., 
supra, at 661. For instance, 18 of 20 peptides designed according to these 
guidelines, containing 8-39 residues covering 75% of the sequence of the 
influenza virus hemagglutinin HA1 polypeptide chain, induced antibodies that 
reacted with the HA1 protein or intact virus; and 12/12 peptides from the MuLV 
polymerase and 18/18 from the rabies glycoprotein induced antibodies that 
precipitated the respective proteins. 

Antigenic epitope-bearing peptides and polypeptides of the invention are 
therefore useful to raise antibodies, including monoclonal antibodies, that bind 
specifically to a polypeptide of the invention. Thus, a high proportion of 
hybridomas obtained by fusion of spleen cells from donors immunized with an 
antigenic epitope-bearing peptide generally secrete antibody reactive with the 
native protein. Sutcliffe et al., supra, at 663. The antibodies raised by antigenic 
epitope-bearing peptides or polypeptides are useful to detect the mimicked 
protein, and antibodies to different peptides may be used for tracking the fate of 
various regions of a protein precursor which undergoes post-translational 
processing. The peptides and anti-peptide antibodies may be used in a variety of 
qualitative or quantitative assays for the mimicked protein, for instance in 
competition assays since it has been shown that even short peptides (e.g., about 
9 amino acids) can bind and displace the larger peptides in immunoprecipitation 
assays. See, for instance, Wilson et al., Cell 37:767-778 (1984) at 777. The anti-
peptide antibodies of the invention also are useful for purification of the 
mimicked protein, for instance, by adsorption chromatography using methods 
well known in the art. 

Antigenic epitope-bearing peptides and polypeptides of the invention 
designed according to the above guidelines preferably contain a sequence of at 
least seven, more preferably at least nine and most preferably between about 15 
to about 30 amino acids contained within the amino acid sequence of a 
polypeptide of the invention. However, peptides or polypeptides comprising a 
larger portion of an amino acid sequence of a polypeptide of the invention, 
containing about 30 to about 50 amino acids, or any length up to and including 
the entire amino acid sequence of a polypeptide of the invention, also are 
considered epitope-bearing peptides or polypeptides of the invention and also are 
useful for inducing antibodies that react with the mimicked protein. Preferably, 
the amino acid sequence of the epitope-bearing peptide is selected to provide 
substantial solubility in aqueous solvents (i.e., the sequence includes relatively 
hydrophilic residues and highly hydrophobic sequences are preferably avoided); 
and sequences containing proline residues are particularly preferred. 
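
The selection guidelines above (a minimum length of seven residues, avoidance of highly hydrophobic sequences, and a preference for proline) can be turned into a rough first-pass screen. Everything specific in this sketch is an assumption made for illustration: the function name, the set of residues counted as hydrophobic, the 0.5 cutoff, and the default window of 15 residues are not taken from the disclosure or from the cited references.

```python
# Residues treated as hydrophobic here; this set is an illustrative assumption.
HYDROPHOBIC = set("AVLIMFWC")


def candidate_peptides(protein: str, length: int = 15,
                       max_hydrophobic_fraction: float = 0.5):
    """Return (1-based start, peptide, contains_proline) for each window
    that is not overly hydrophobic, proline-containing windows first."""
    if length < 7:
        raise ValueError("peptides should contain at least seven residues")
    candidates = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        fraction = sum(residue in HYDROPHOBIC for residue in peptide) / length
        if fraction <= max_hydrophobic_fraction:
            candidates.append((start + 1, peptide, "P" in peptide))
    # Proline-containing peptides sort ahead of the rest, then by position.
    candidates.sort(key=lambda c: (not c[2], c[0]))
    return candidates


if __name__ == "__main__":
    # Toy sequence, not an M. jannaschii protein.
    for row in candidate_peptides("LLLLLLLKDEPSTKR", length=7):
        print(row)
```

A screen of this kind only narrows the candidates; whether a given peptide elicits protein-reactive antibodies remains an experimental question.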

The epitope-bearing peptides and polypeptides of the invention may be 
produced by any conventional means for making peptides or polypeptides 
including recombinant means using nucleic acid molecules of the invention. For 
instance, a short epitope-bearing amino acid sequence may be fused to a larger 
polypeptide which acts as a carrier during recombinant production and 
purification, as well as during immunization to produce anti-peptide antibodies. 
Epitope-bearing peptides also may be synthesized using known methods of 
chemical synthesis. For instance, Houghten has described a simple method for 
synthesis of large numbers of peptides, such as 10-20 mg of 248 different 13 
residue peptides representing single amino acid variants of a segment of the HA1 
polypeptide which were prepared and characterized (by ELISA-type binding 
studies) in less than four weeks. Houghten, R. A. (1985) General method for 
the rapid solid-phase synthesis of large numbers of peptides: specificity of 
antigen-antibody interaction at the level of individual amino acids. Proc. Natl. 
Acad. Sci. USA 82:5131-5135. This "Simultaneous Multiple Peptide Synthesis 
(SMPS)" process is further described in U.S. Patent No. 4,631,211 to Houghten 
et al. (1986). In this procedure the individual resins for the solid-phase synthesis 
of various peptides are contained in separate solvent-permeable packets, enabling 
the optimal use of the many identical repetitive steps involved in solid-phase 
methods. A completely manual procedure allows 500-1000 or more syntheses to 
be conducted simultaneously. Houghten et al., supra, at 5134. 

Epitope-bearing peptides and polypeptides of the invention are used to 
induce antibodies according to methods well known in the art. See, for instance, 
Sutcliffe et al., supra; Wilson et al., supra; Chow, M. et al., Proc. Natl. Acad. 
Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985). 
Generally, animals may be immunized with free peptide; however, anti-peptide 
antibody titer may be boosted by coupling of the peptide to a macromolecular 
carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid. For 
instance, peptides containing cysteine may be coupled to carrier using a linker 
such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other 
peptides may be coupled to carrier using a more general linking agent such as 
glutaraldehyde. Animals such as rabbits, rats and mice are immunized with either 
free or carrier-coupled peptides, for instance, by intraperitoneal and/or 
intradermal injection of emulsions containing about 100 µg peptide or carrier 
protein and Freund's adjuvant. Several booster injections may be needed, for 
instance, at intervals of about two weeks, to provide a useful titer of anti-peptide 
antibody which can be detected, for example, by ELISA assay using free peptide 
adsorbed to a solid surface. The titer of anti-peptide antibodies in serum from an 
immunized animal may be increased by selection of anti-peptide antibodies, for 
instance, by adsorption to the peptide on a solid support and elution of the 
selected antibodies according to methods well known in the art. 

Immunogenic epitope-bearing peptides of the invention, i.e., those parts 
of a protein that elicit an antibody response when the whole protein is the 
immunogen, are identified according to methods known in the art. For instance, 
Geysen et al, supra, discloses a procedure for rapid concurrent synthesis on solid 
supports of hundreds of peptides of sufficient purity to react in an enzyme-linked 
immunosorbent assay. Interaction of synthesized peptides with antibodies is then 
easily detected without removing them from the support. In this manner a peptide 
bearing an immunogenic epitope of a desired protein may be identified routinely 
by one of ordinary skill in the art. For instance, the immunologically important 
epitope in the coat protein of foot-and-mouth disease virus was located by Geysen 
et al. with a resolution of seven amino acids by synthesis of an overlapping set of 
all 208 possible hexapeptides covering the entire 213 amino acid sequence of the 
protein. Then, a complete replacement set of peptides in which all 20 amino 
acids were substituted in turn at every position within the epitope were 
synthesized, and the particular amino acids conferring specificity for the reaction 
with antibody were determined. Thus, peptide analogs of the epitope-bearing 
peptides of the invention can be made routinely by this method. U.S. Patent No. 
4,708,781 to Geysen (1987) further describes this method of identifying a peptide 
bearing an immunogenic epitope of a desired protein. 
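
The peptide counts in the example above follow directly from the window arithmetic: a 213-residue sequence contains 213 - 6 + 1 = 208 overlapping hexapeptides. The sketch below enumerates such a set and a complete replacement set for one peptide. The function names and the placeholder sequence are hypothetical and are used only to check the counts; they do not represent the foot-and-mouth disease virus coat protein.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def overlapping_peptides(protein: str, length: int = 6) -> list[str]:
    """Return every window of the given length, offset by one residue."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def replacement_set(peptide: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue, variant) with each of the 20
    amino acids placed in turn at every position of the peptide."""
    variants = []
    for index in range(len(peptide)):
        for residue in AMINO_ACIDS:
            variants.append(
                (index + 1, residue, peptide[:index] + residue + peptide[index + 1:])
            )
    return variants


if __name__ == "__main__":
    placeholder = (AMINO_ACIDS * 11)[:213]  # arbitrary 213-residue string
    hexapeptides = overlapping_peptides(placeholder)
    print(len(hexapeptides))                      # number of hexapeptides
    print(len(replacement_set(hexapeptides[0])))  # 6 positions x 20 residues
```

Note that the replacement set as generated here includes the parent peptide once per position, since the original residue is among the 20 substituted in turn.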

Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a 
general method of detecting or determining the sequence of monomers (amino 
acids or other compounds) which is a topological equivalent of the epitope (i.e., 
a "mimotope") which is complementary to a particular paratope (antigen binding 
site) of an antibody of interest. More generally, U.S. Patent No. 4,433,092 to 
Geysen (1989) describes a method of detecting or determining a sequence of 
monomers which is a topographical equivalent of a ligand which is 
complementary to the ligand binding site of a particular receptor of interest. 
Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on 
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated 
oligopeptides and sets and libraries of such peptides, as well as methods for using 
such oligopeptide sets and libraries for determining the sequence of a peralkylated 
oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, 
non-peptide analogs of the epitope-bearing peptides of the invention also can be 
made routinely by these methods. 

The entire disclosure of each document cited in this section on 
"Polypeptides and Fragments" is hereby incorporated herein by reference. 

As one of skill in the art will appreciate, the polypeptides of the present 
invention and the epitope-bearing fragments thereof described above can be 
combined with parts of the constant domain of immunoglobulins (IgG), resulting 
in chimeric polypeptides. These fusion proteins facilitate purification and show 
an increased half-life in vivo. This has been demonstrated, e.g., for chimeric 
proteins consisting of the first two domains of the human CD4-polypeptide and 
various domains of the constant regions of the heavy or light chains of 
mammalian immunoglobulins (EPA 394,827; Traunecker et al., Nature 331:84-
86 (1988)). Fusion proteins that have a disulfide-linked dimeric structure due to 
the IgG part can also be more efficient in binding and neutralizing other 
molecules than the monomeric protein or protein fragment alone (Fountoulakis 
et al., J. Biochem. 270:3958-3964 (1995)). 

Protein Function 

Each ORF described in Table 2(a) was assigned to biological role 
categories adapted from Riley, M., Microbiology Reviews 57(4):862 (1993). This 
allows the skilled artisan to determine a function for each identified coding 
sequence. For example, a partial list of the M. jannaschii protein functions 
provided in Table 2(a) includes: methanogenesis, amino acid biosynthesis, cell 
division, detoxification, protein secretion, transformation, central intermediary 
metabolism, energy metabolism, degradation of DNA, DNA replication, 
restriction, modification, recombination and repair, transcription, RNA 
processing, translation, degradation of proteins, peptides and glycopeptides, 
ribosomal proteins, translation factors, transport, tRNA modification, and drug 
and analog sensitivity. A more detailed description of several of these functions 
is provided in Example 1 below. 



Diagnostic Assays 



The present invention further provides methods to identify the expression 
of an ORF of the present invention, or homolog thereof, in a test sample, using 
one of the DFs or antibodies of the present invention. Such methods involve 
incubating a test sample with one or more of the antibodies or one or more of the 
DFs of the present invention and assaying for binding of the DFs or antibodies to 
components within the test sample. 

Conditions for incubating a DF or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection 
methods employed, and the type and nature of the DF or antibody used in the 
assay. One skilled in the art will recognize that any one of the commonly 
available hybridization, amplification or immunological assay formats can readily 
be adapted to employ the DFs or antibodies of the present invention. Examples 
of such assays can be found in Chard, T., An Introduction to Radioimmunoassay 
and Related Techniques, Elsevier Science Publishers, Amsterdam, The 
Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, 
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); 
Tijssen, P., Practice and Theory of Enzyme Immunoassays. Laboratory 
Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, 
Amsterdam, The Netherlands (1985). 

The test samples of the present invention include cells, protein or 
membrane extracts of cells. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the cells 
or extracts used as the sample to be assayed. Methods for preparing protein 
extracts or membrane extracts of cells are well known in the art and can 
readily be adapted in order to obtain a sample which is compatible with the 
system utilized. 

In another embodiment of the present invention, kits are provided which 
contain the necessary reagents to carry out the assays of the present invention. 
Specifically, the invention provides a compartmentalized kit to receive, in close 
confinement, one or more containers comprising: (a) a first container 
comprising one of the DFs or antibodies of the present invention; and (b) one or 
more other containers comprising one or more of the following: wash reagents, 
reagents capable of detecting presence of a bound DF or antibody. 

A compartmentalized kit includes any kit in which reagents are contained 
in separate containers. Such containers include small glass containers, plastic 
containers or strips of plastic or paper. Such containers allow one to efficiently 
transfer reagents from one compartment to another compartment such that the 
samples and reagents are not cross-contaminated, and the agents or solutions of 
each container can be added in a quantitative fashion from one compartment to 
another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers 
which contain wash reagents (such as phosphate buffered saline, Tris-buffers, 
etc.), and containers which contain the reagents used to detect the bound antibody 
or DF. 

Types of detection reagents include labeled nucleic acid probes, labeled 
secondary antibodies, or in the alternative, if the primary antibody is labeled, the 
enzymatic, or antibody binding reagents that are capable of reacting with the 
labeled antibody. One skilled in the art will readily recognize that the disclosed 
DFs and antibodies of the present invention can be readily incorporated into one 
of the established kit formats that are well known in the art. 

Screening Assay for Binding Agents 

Using the isolated proteins described herein, the present invention further 
provides methods of obtaining and identifying agents that bind to a protein 
encoded by a M. jannaschii ORF or to a fragment thereof. 

The method involves: 

(a) contacting an agent with an isolated protein encoded by a M. 
jannaschii ORF, or an isolated fragment thereof; and 

(b) determining whether the agent binds to said protein or said 
fragment. 

The agents screened in the above assay can be, but are not limited to, 
peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The 
agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques. For random screening, agents such as 
peptides, carbohydrates, pharmaceutical agents and the like are selected at 
random and are assayed for their ability to bind to the protein encoded by an ORF 
of the present invention. 

Alternatively, agents may be rationally selected or designed. As used 
herein, an agent is said to be "rationally selected or designed" when the agent is 
chosen based on the configuration of the particular protein. For example, one 
skilled in the art can readily adapt currently available procedures to generate 
peptides, pharmaceutical agents and the like capable of binding to a specific 
peptide sequence in order to generate rationally designed antipeptide peptides, for 
example see Hurby et al., Application of Synthetic Peptides: Antisense Peptides, 
In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, 
and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, 
or the like. 

In addition to the foregoing, one class of agents of the present invention
can be used to control gene expression through binding to one of the ORFs or
EMFs of the present invention. As described above, such agents can be randomly
screened or rationally designed and selected. Targeting the ORF or EMF allows 
a skilled artisan to design sequence specific or element specific agents, 
modulating the expression of either a single ORF or multiple ORFs that rely on 
the same EMF for expression control. 

One class of DNA binding agents comprises those that contain nucleotide base
residues that hybridize or form a triple helix by binding to DNA or RNA. Such 
agents can be based on the classic phosphodiester, ribonucleic acid backbone, or 
can be a variety of sulfhydryl or polymeric derivatives having base attachment 
capacity. 

Agents suitable for use in these methods usually contain 20 to 40 bases
and are designed to be complementary to a region of the gene involved in
transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979);
Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360
(1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention is necessary for the design of an antisense or
triple helix oligonucleotide and other DNA binding agents.
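
By way of illustration only, the antisense design described above can be
sketched in Python. The function names and the example sequence are
hypothetical and are not part of the disclosure; the 20 to 40 base limit is the
range stated in the text. The sketch assumes that the mRNA has the same
sequence as the coding strand, so the antisense oligonucleotide is the reverse
complement of the chosen coding-strand region.

```python
# Illustrative sketch; names are hypothetical, not taken from the disclosure.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo complementary to the mRNA region that corresponds
    to coding_strand[start:start + length] (0-based start)."""
    if not 20 <= length <= 40:
        raise ValueError("the text describes agents of 20 to 40 bases")
    target = coding_strand[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    # Made-up coding-strand fragment, used only to exercise the function.
    fragment = "ATG" + "A" * 20
    print(antisense_oligo(fragment, 0, 20))
```

Whether such an oligonucleotide is effective in a given system must be
determined experimentally; the sketch addresses only the sequence design step.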

Computer Related Embodiments 

The nucleotide sequence provided in SEQ ID NO:1, 2, or 3, a
representative fragment thereof, or a nucleotide sequence at least 99.9% identical
to the sequence provided in SEQ ID NO:1, 2, or 3, can be "provided" in a variety
of mediums to facilitate use thereof. As used herein, "provided" refers to a
manufacture, other than an isolated nucleic acid molecule, that contains a
nucleotide sequence of the present invention, i.e., the nucleotide sequence
provided in SEQ ID NO:1, 2, or 3, a representative fragment thereof, or a
nucleotide sequence at least 99.9% identical to SEQ ID NO:1, 2, or 3. Such a
manufacture provides the M. jannaschii genome or a subset thereof (e.g., an M.
jannaschii open reading frame (ORF)) in a form that allows a skilled artisan to
examine the manufacture using means not directly applicable to examining the 
M. jannaschii genome or a subset thereof as it exists in nature or in purified form. 

In one application of this embodiment, a nucleotide sequence of the 
present invention can be recorded on computer readable media. As used herein, 
"computer readable media" refers to any medium that can be read and accessed 
directly by a computer. Such media include, but are not limited to: magnetic 
storage media, such as floppy discs, hard disc storage medium, and magnetic 
tape; optical storage media such as CD-ROM; electrical storage media such as 
RAM and ROM; and hybrids of these categories such as magnetic/optical storage 
media. A skilled artisan can readily appreciate how any of the presently known 
computer readable mediums can be used to create a manufacture comprising 
computer readable medium having recorded thereon a nucleotide sequence of the 
present invention. 

As used herein, "recorded" refers to a process for storing information on 
computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the 
present invention. A variety of data storage structures are available to a skilled 
artisan for creating a computer readable medium having recorded thereon a 
nucleotide sequence of the present invention. The choice of the data storage 
structure will generally be based on the means chosen to access the stored 
information. In addition, a variety of data processor programs and formats can 
be used to store the nucleotide sequence information of the present invention on 
computer readable medium. The sequence information can be represented in a 
word processing text file, formatted in commercially-available software such as 
WordPerfect and Microsoft Word, or represented in the form of an ASCII file,
stored in a database application, such as DB2, Sybase, Oracle, or the like. A
skilled artisan can readily adapt any number of data processor structuring formats
(e.g. text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
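
As one non-limiting illustration of representing sequence information in the
form of an ASCII file, the following Python sketch serializes a sequence as
plain text and reads it back. The FASTA-style layout (a ">" header line
followed by fixed-width sequence lines) is an assumption made for this example;
the text above does not prescribe any particular file layout, and the function
names are hypothetical.

```python
# Illustrative sketch; the FASTA-style layout is an assumption of this example.
def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Serialize one sequence as ASCII text with lines of at most `width` bases."""
    lines = [f">{name}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


def from_fasta(text: str) -> tuple[str, str]:
    """Parse text produced by to_fasta back into (name, sequence)."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("missing '>' header line")
    return lines[0][1:], "".join(lines[1:])


if __name__ == "__main__":
    # Made-up fragment; the real sequences are SEQ ID NO:1, 2, and 3.
    record = to_fasta("example_fragment", "ACGTACGT", width=4)
    print(record, end="")
    print(from_fasta(record))
```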

By providing the nucleotide sequence of SEQ ID NO:1, 2, or 3, a
representative fragment thereof, or a nucleotide sequence at least 99.9% identical
to SEQ ID NO:1, 2, or 3, in computer readable form, a skilled artisan can
routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which
follow demonstrate how software which implements the BLAST (Altschul et al.,
J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem.
17:203-207 (1993)) search algorithms on a Sybase system can be used to identify
open reading frames (ORFs) within the M. jannaschii genome that contain
homology to ORFs or proteins from other organisms. Such ORFs are
protein-encoding fragments within the M. jannaschii genome and are useful in
producing commercially important proteins such as enzymes used in
methanogenesis, amino acid biosynthesis, metabolism, fermentation,
transcription, translation, RNA processing, nucleic acid and protein degradation,
protein modification, and DNA replication, restriction, modification,
recombination, and repair. A comprehensive list of ORFs encoding commercially
important M. jannaschii proteins is provided in Tables 2(a) and 3.

The present invention further provides systems, particularly
computer-based systems, which contain the sequence information described
herein. Such systems are designed to identify commercially important fragments
of the M. jannaschii genome. As used herein, "a computer-based system" refers
to the hardware means, software means, and data storage means used to analyze
the nucleotide sequence information of the present invention. The minimum
hardware means of the computer-based systems of the present invention
comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the
currently available computer-based systems is suitable for use in the present
invention.

As indicated above, the computer-based systems of the present invention
comprise a data storage means having stored therein a nucleotide sequence of the 
present invention and the necessary hardware means and software means for 
supporting and implementing a search means. As used herein, "data storage 
means" refers to memory that can store nucleotide sequence information of the 
present invention, or a memory access means which can access manufactures 
having recorded thereon the nucleotide sequence information of the present 
invention. As used herein, "search means" refers to one or more programs which 
are implemented on the computer-based system to compare a target sequence or 
target structural motif with the sequence information stored within the data 
storage means. Search means are used to identify fragments or regions of the M.
jannaschii genome that match a particular target sequence or target motif. A
variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means is available and can be used in
the computer-based systems of the present invention. Examples of such software
include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX
(NCBIA). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches 
can be adapted for use in the present computer-based systems. 
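
For illustration only, a very simple search means can be sketched in Python
as an exact-match scan of both strands. This is not BLAST, BLAZE, or any of
the software named above; it is a minimal stand-in under the assumption that
the stored sequence and the target are plain DNA strings, and the function
names are hypothetical.

```python
# Illustrative sketch of an exact-match "search means"; not BLAST or BLAZE.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_matches(genome: str, target: str) -> list[tuple[int, str]]:
    """Return (position, strand) for every exact occurrence of target.

    Positions are 0-based offsets on the forward strand; strand "-" means
    the reverse complement of the target was found at that offset.
    """
    genome, target = genome.upper(), target.upper()
    hits = []
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        pos = genome.find(query)
        while pos != -1:
            hits.append((pos, strand))
            pos = genome.find(query, pos + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence used only to exercise the function.
    print(find_matches("AAATGCCCGGGCATTT", "ATGCCC"))
```

Homology searches in practice tolerate mismatches and gaps, which this
exact-match sketch deliberately omits.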

As used herein, a "target sequence" can be any DNA or amino acid 
sequence of six or more nucleotides or two or more amino acids. A skilled 
artisan can readily recognize that the longer a target sequence is, the less likely 
a target sequence will be present as a random occurrence in the database. The 
most preferred sequence length of a target sequence is from about 10 to 100 
amino acids or from about 30 to 300 nucleotide residues. However, it is well
recognized that commercially important fragments of the M. jannaschii genome,
such as sequence fragments involved in gene expression and protein processing,
may be of shorter length.
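
The relationship between target length and chance occurrence noted above
can be made concrete with a short calculation. The Python sketch below is
illustrative only and rests on simplifying assumptions that are not part of
the disclosure: all four bases are treated as equally frequent (the actual
G+C content of the main chromosome is about 31%), both strands are searched,
and the chromosome is treated as linear. The function name is hypothetical.

```python
# Illustrative sketch; assumes equal base frequencies and a linear sequence.
def expected_random_hits(genome_length: int, target_length: int,
                         alphabet_size: int = 4, strands: int = 2) -> float:
    """Expected number of chance occurrences of one specific target of the
    given length in a random sequence of the given length."""
    positions = max(genome_length - target_length + 1, 0)
    return strands * positions / alphabet_size ** target_length


if __name__ == "__main__":
    main_chromosome = 1_664_976  # bp, main chromosome length from this disclosure
    for length in (6, 12, 30):
        print(length, expected_random_hits(main_chromosome, length))
```

Under these assumptions a six-nucleotide target is expected to occur by
chance roughly 800 times in the main chromosome, whereas a 30-nucleotide
target is expected essentially never, which is consistent with the preferred
lengths stated above.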

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the 
sequence(s) are chosen based on a three-dimensional configuration which is
formed upon the folding of the target motif. There are a variety of target motifs
known in the art. Protein target motifs include, but are not limited to, enzymic
active sites and signal sequences. Nucleic acid target motifs include, but are not
limited to, promoter sequences, hairpin structures and inducible expression
elements (protein binding sequences).

Thus, the present invention further provides an input means for receiving 
a target sequence, a data storage means for storing the target sequence and the 
homologous M. jannaschii sequence identified using a search means as described
above, and an output means for outputting the identified homologous M.
jannaschii sequence. A variety of structural formats for the input and output
means can be used to input and output information in the computer-based systems 
of the present invention. A preferred format for an output means ranks fragments 
of the M. jannaschii genome possessing varying degrees of homology to the 
target sequence or target motif. Such presentation provides a skilled artisan with 
a ranking of sequences which contain various amounts of the target sequence or 
target motif and identifies the degree of homology contained in the identified 
fragment. 
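
As a non-limiting sketch of the ranked output format described above, the
following Python example scores every window of the stored sequence against a
target and lists the best-matching fragments first. Ungapped percent identity
is used here as a simple stand-in for "degree of homology"; that scoring
choice, the function name, and the example sequences are assumptions of this
example rather than features of the disclosure.

```python
# Illustrative sketch; ungapped identity is an assumed stand-in for homology.
def rank_windows(genome: str, target: str, top: int = 5):
    """Return up to `top` tuples (identity, start, fragment), best first.

    identity is the fraction of positions at which the genome window and the
    target carry the same base; ties are ordered by start position.
    """
    n = len(target)
    scored = []
    for start in range(len(genome) - n + 1):
        window = genome[start:start + n]
        matches = sum(a == b for a, b in zip(window, target))
        scored.append((matches / n, start, window))
    scored.sort(key=lambda hit: (-hit[0], hit[1]))
    return scored[:top]


if __name__ == "__main__":
    # Made-up sequences used only to exercise the function.
    for identity, start, fragment in rank_windows("ACGTACGAACGT", "ACGT", top=3):
        print(f"{identity:.2f}\t{start}\t{fragment}")
```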

A variety of comparing means can be used to compare a target sequence 
or target motif with the data storage means to identify sequence fragments of the
M. jannaschii genome. For example, software which implements the
BLAST and BLAZE algorithms (Altschul et al., J. Mol. Biol. 215:403-410
(1990)) can be used to identify open reading frames within the M. jannaschii
genome. A skilled artisan can readily recognize that any one of the publicly
available homology search programs can be used as the search means for the
computer-based systems of the present invention.

One application of this embodiment is provided in Figure 4. Figure 4 
provides a block diagram of a computer system 102 that can be used to 
implement the present invention. The computer system 102 includes a processor
106 connected to a bus 104. Also connected to the bus 104 are a main memory
108 (preferably implemented as random access memory, RAM) and a variety of
secondary storage devices 110, such as a hard drive 112 and a removable medium
storage device 114. The removable medium storage device 114 may represent,
for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc.
A removable storage medium 116 (such as a floppy disk, a compact disk, a
magnetic tape, etc.) containing control logic and/or data recorded therein may be
inserted into the removable medium storage device 114. The computer system
102 includes appropriate software for reading the control logic and/or the data
from the removable storage medium 116 once inserted in the removable
medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well 
known manner in the main memory 108, any of the secondary storage devices 
110, and/or a removable storage medium 116. Software for accessing and
processing the genomic sequence (such as search tools, comparing tools, etc.)
resides in main memory 108 during execution.

Having generally described the invention, the same will be more readily 
understood by reference to the following examples, which are provided by way 
of illustration and are not intended as limiting. 



Experimental 

Complete genome sequence of the methanogenic archaeon, 
Methanococcus jannaschii 

Example 1 

A whole genome random sequencing method (Fleischmann, R.D., et al.,
Science 269:496 (1995); Fraser, C.M., et al., Science 270:397 (1995)) was used
to obtain the complete genome sequence for M. jannaschii. A small insert
plasmid library (2.5 Kbp average insert size) and a large insert lambda library (16
Kbp average insert size) were used as substrates for sequencing. The lambda
library was used to form a genome scaffold and to verify the orientation and
integrity of the contigs formed from the assembly of sequences from the plasmid
library. All clones were sequenced from both ends to aid in ordering of contigs
during the sequence assembly process. The average length of sequencing reads
was 481 bp. A total of 36,718 sequences were assembled by means of the TIGR
Assembler (Fleischmann, R.D., et al., Science 269:496 (1995); Fraser, C.M., et
al., Science 270:397 (1995); Sutton G., et al., Genome Sci. Tech. 1:9 (1995)).
Sequence and physical gaps were closed using a combination of strategies
(Fleischmann, R.D., et al., Science 269:496 (1995); Fraser, C.M., et al., Science
270:397 (1995)). The colinearity of the in vivo genome to the genome sequence
was confirmed by comparing restriction fragments from six rare-cutter
restriction enzymes (Aat II, BamHI, Bgl II, Kpn I, Sma I, and Sst II) to those
predicted from the sequence data. Additional confidence in the colinearity was
provided by the genome scaffold produced by sequence pairs from 339
large-insert lambda clones, which covered 88% of the main chromosome. Open
reading frames (ORFs) and predicted protein-coding regions were identified as
described (Fleischmann, R.D., et al., Science 269:496 (1995); Fraser, C.M., et
al., Science 270:397 (1995)) with some modification. In particular, the statistical
prediction of M. jannaschii genes was performed with GeneMark (Borodovsky,
M. & McIninch, J., Comput. Chem. 17:123 (1993)). Regular GeneMark uses
nonhomogeneous Markov models derived from a training set of coding sequences
and ordinary Markov models derived from a training set of noncoding sequences.
Only a single 16S ribosomal RNA sequence of M. jannaschii was available in the
public sequence databases before the whole genome sequence described here.
Thus, the initial training set to determine parameters of a coding sequence
Markov model was chosen as a set of ORFs >1000 nucleotides (nt). As an initial
model for non-coding sequences, a zero-order Markov model with genome- 
specific nucleotide frequencies was used. The initial models were used at the first 
prediction step. The results of the first prediction were then used to compile a set 
of putative genes used at the second training step. Alternate rounds of training 
and predicting were continued until the set of predicted genes stabilized and the 
parameters of the final fourth-order model of coding sequences were derived. 
The regions predicted as noncoding were then used as a training set for a final 
model for noncoding regions. Cross-validation simulations demonstrated that the 
GeneMark program trained as described above was able to correctly identify 
coding regions of at least 96 nt in 94% of the cases and noncoding regions of the
same length in 96% of the cases. These values assume that the self-training
method produced correct sequence annotation for compiled control sets.
Comparison with the results obtained by searches against a nonredundant protein
database (Fleischmann, R.D., et al., Science 269:496 (1995); Fraser, C.M., et al.,
Science 270:397 (1995)) demonstrated that almost all genes identified by
sequence similarity were predicted by the GeneMark program as well. This
observation provides additional confidence in genes predicted by GeneMark
whose protein translations did not show significant similarity to known protein
sequences. The predicted protein-coding regions were searched against the Blocks
database (Henikoff, S. & Henikoff, J.G., Genomics 19:97 (1994)) by means of
BLIMPS (Wallace, J.C. & Henikoff, S., CABIOS 8:249 (1992)) to verify putative
identifications and to identify potential functional motifs in predicted protein- 
coding regions that had no database match. Genes were assigned to known 
metabolic pathways. When a gene appeared to be missing from a pathway, the 
unassigned ORFs and the complete M. jannaschii genome sequence were
searched with specific query sequences or motifs from the Blocks database.
Hydrophobicity plots were generated for all predicted protein-coding regions by
means of the Kyte-Doolittle algorithm (Kyte, J. & Doolittle, R.F., J. Mol. Biol.
157:105 (1982)) to identify potentially functionally relevant signatures in these
sequences.
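
By way of illustration, the Kyte-Doolittle hydropathy calculation referred
to above can be sketched in Python as a sliding-window average over the
published per-residue hydropathy values. The window length of 9 residues is
an assumption of this example (the text does not state the window that was
used), and the function name and test peptides are hypothetical.

```python
# Illustrative sketch; the window length is an assumed parameter.
# Hydropathy index values from Kyte & Doolittle (1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Return the mean hydropathy of each full window along the protein.

    The result has len(protein) - window + 1 values; it is empty when the
    protein is shorter than the window. Unknown residue letters raise KeyError.
    """
    values = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Made-up peptide: a hydrophobic stretch flanked by charged residues.
    profile = hydropathy_profile("KKRDE" + "LIVLIVLIV" + "KKRDE", window=9)
    print([round(value, 2) for value in profile])
```

Stretches with a strongly positive window average are candidates for
membrane-spanning or otherwise hydrophobic segments and would still need to be
interpreted in the context of the whole protein.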

The M. jannaschii genome comprises three physically distinct elements: 
i) a large circular chromosome of 1,664,976 base pairs (bp) (SEQ ID NO:1),
which contains 1682 predicted protein-coding regions and has a G+C content of
31.4%; ii) a large circular extrachromosomal element (ECE) (Zhao, H., et al.,
Arch. Microbiol. 150:178 (1988)) of 58,407 bp (SEQ ID NO:2), which contains
44 predicted protein-coding regions and has a G+C content of 28.2%; and iii) a
small circular ECE (Zhao, H., et al., Arch. Microbiol. 150:178 (1988)) of 16,550
bp (SEQ ID NO:3), which contains 12 predicted protein-coding regions and has
a G+C content of 28.8%. With respect to its shape, size, G+C content, and gene
density, the main chromosome resembles that of H. influenzae. However, here the
resemblance stops.
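
The figures given above for the three elements can be cross-checked with a
short calculation. The Python sketch below is illustrative only: the lengths,
gene counts, and G+C percentages are those stated in the text, while the
variable and function names are hypothetical. A helper for computing G+C
content from a sequence string is included to show how such a percentage is
obtained.

```python
# Illustrative sketch; numbers are quoted from the text, names are hypothetical.
# name -> (length in bp, predicted protein-coding regions, G+C content in %)
ELEMENTS = {
    "main chromosome (SEQ ID NO:1)": (1_664_976, 1682, 31.4),
    "large ECE (SEQ ID NO:2)": (58_407, 44, 28.2),
    "small ECE (SEQ ID NO:3)": (16_550, 12, 28.8),
}


def gc_percent(sequence: str) -> float:
    """Return the G+C content of a DNA string as a percentage."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)


total_bp = sum(length for length, _, _ in ELEMENTS.values())    # 1,739,933 bp
total_genes = sum(genes for _, genes, _ in ELEMENTS.values())   # 1738 regions
main_length, main_genes, _ = ELEMENTS["main chromosome (SEQ ID NO:1)"]
bp_per_gene = main_length / main_genes  # roughly one predicted region per kb

if __name__ == "__main__":
    print(total_bp, total_genes, round(bp_per_gene))
    print(gc_percent("GGCCAATT"))
```

The total of 1738 predicted protein-coding regions agrees with the number
stated in the Field of the Invention.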

Of the 1743 predicted protein-coding regions reported previously for H.
influenzae, 78% had a match in the public sequence database (Fleischmann, R.D.,
et al., Science 269:496 (1995); Fraser, C.M., et al., Science 270:397 (1995)). Of
these, 58% were matches to genes with reasonably well defined function, while
20% were matches to genes whose function was undefined. Similar observations
were made for the M. genitalium genome (Fleischmann, R.D., et al., Science
269:496 (1995); Fraser, C.M., et al., Science 270:397 (1995)). Eighty-three
percent of the predicted protein-coding regions from M. genitalium have a
counterpart in the H. influenzae genome. In contrast, only 38% of the predicted
protein-coding regions from M. jannaschii match a gene in the database that
could be assigned a putative cellular role with high confidence; 6% of the
predicted protein-coding regions had matches to hypothetical proteins (Tables
2-3). Approximately 100 genes in M. jannaschii had marginal similarity to genes
or segments of genes from the public sequence databases and could not be
assigned a putative cellular role with high confidence. Only 11% of the predicted
protein-coding regions from H. influenzae and 17% of the predicted
protein-coding regions from M. genitalium matched a predicted protein-coding
region from M. jannaschii. Clearly the M. jannaschii genome, and undoubtedly,
therefore, all archaeal genomes are remarkably unique, as the phylogenetic
position of these organisms would suggest.

Energy production in M. jannaschii occurs via the reduction of CO2 with
H2 to produce methane. Genes for all of the known enzymes and enzyme
complexes associated with methanogenesis (DiMarco, A.A., et al., Ann. Rev.
Biochem. 59:355 (1990)) were identified in M. jannaschii, the sequence and
order of which are typical of methanogens. M. jannaschii appears to use both
H2 and formate as substrates for methanogenesis, but lacks the genes to use
methanol or acetate. The ability to fix nitrogen has been demonstrated in a
number of methanogens (Belay, N., et al., Nature 312:286 (1984)) and all of the
genes necessary for this pathway have been identified in M. jannaschii (Tables
2-3). In addition to its anabolic pathways, several scavenging molecules have
been identified in M. jannaschii that probably play a role in importing small
organic compounds, such as amino acids, from the environment (Tables 2-3).

Three different pathways are known for the fixation of CO2 into organic
carbon: the non-cyclic, reductive acetyl-coenzyme A-carbon monoxide
dehydrogenase pathway (Ljungdahl-Wood pathway), the reductive tricarboxylic
acid (TCA) cycle, and the Calvin cycle. Methanogens fix carbon by the
Ljungdahl-Wood pathway (Wood, H.G., et al., TIBS 11:14 (1986)), which is
facilitated by the carbon monoxide dehydrogenase enzyme complex (CODH)
(Blaut, M., Antonie van Leeuwenhoek 66:187 (1994)). The complete
Ljungdahl-Wood pathway, encoded in the M. jannaschii genome, depends on the
methyl carbon in methanogenesis; however, methanogenesis can occur
independently of carbon fixation.

Although genes encoding two enzymes required for gluconeogenesis 
(glucopyruvate oxidoreductase and phosphoenolpyruvate synthase) were found 
in the M. jannaschii genome, genes encoding other key enzymes of
gluconeogenesis (fructose bisphosphatase and fructose 1,6-bisphosphate aldolase)
were not identified. Glucose catabolism by glycolysis also requires the
aldolase, as well as phosphofructokinase, an enzyme that also was not found in
M. jannaschii and has not been detected in any of the Archaea. In addition, genes
specific for the Entner-Doudoroff pathway, an alternative pathway used by some
microbes for the catabolism of glucose, were not identified in the genomic
sequence. The presence of a number of nearly complete metabolic pathways
suggests that some key genes are not recognizable at the sequence level, although
we cannot exclude the possibility that M. jannaschii may use alternative
metabolic pathways.

In general, M. jannaschii genes that encode proteins involved in the
transport of small inorganic ions into the cell are homologs of bacterial genes. 
The genome includes many representatives of the ABC transporter family, as well 
as genes for exporting heavy metals (e.g., the chromate-resistance protein) and 
other toxic compounds (e.g., the norA drug efflux pump locus).

More than 20 predicted protein-coding regions have sequence similarity
to polysaccharide biosynthetic enzymes. These genes have only bacterial
homologs or are most closely related to their bacterial counterparts. The
identified polysaccharide biosynthetic genes in M. jannaschii include those for
the interconversion of sugars, activation of sugars to nucleotide sugars, and
glycosyltransferases for the polymerization of nucleotide sugars into oligo- and
polysaccharides that are subsequently incorporated into surface structures
(Hartmann, E. and Konig, H., Arch. Microbiol. 151:274 (1989)). In an
arrangement reminiscent of bacterial polysaccharide biosynthesis genes, many of
the genes for M. jannaschii polysaccharide production are clustered together
(Tables 2-3). The G+C content in this region is <95% of that in the rest of the M.
jannaschii genome. A similar observation was made in Salmonella typhimurium
(Jiang, X.M., et al., Mol. Microbiol. 5:695 (1991)) in which the gene cluster for
lipopolysaccharide O antigen has a significantly lower G+C ratio than the rest of
the genome. In that case, the difference in G+C content was interpreted as
meaning that the region originated by lateral transfer from another organism.
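
The kind of comparison described above, in which the G+C content of a
region is expressed relative to that of the genome, can be sketched in Python
as follows. This is an illustration under stated assumptions, not the
procedure actually used: the window and step sizes are arbitrary parameters,
each window is compared with the average of the whole input sequence rather
than with "the rest of the genome", and the function names are hypothetical.
The 0.95 ratio mirrors the "<95%" figure in the text.

```python
# Illustrative sketch; window/step sizes and whole-sequence baseline are assumptions.
def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA string (0.0 if empty)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def low_gc_windows(genome: str, window: int, step: int, ratio: float = 0.95):
    """Return (start, gc_fraction) for windows whose G+C fraction is below
    `ratio` times the G+C fraction of the whole sequence."""
    overall = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        fraction = gc_fraction(genome[start:start + window])
        if fraction < ratio * overall:
            flagged.append((start, fraction))
    return flagged


if __name__ == "__main__":
    # Made-up sequence with an AT-rich block in the middle.
    toy = "GC" * 10 + "AT" * 10 + "GC" * 10
    print(low_gc_windows(toy, window=20, step=20))
```

A low G+C window is only a hint; as the text notes for Salmonella
typhimurium, attributing such a region to lateral transfer is an
interpretation that requires additional evidence.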

Of the three main multicomponent information processing systems 
(transcription, translation, and replication), translation appears the most universal 
in its overall makeup in that the basic translation machinery is similar in all three 
domains of life. M. jannaschii has two ribosomal RNA operons, designated A
and B, and a separate 5S RNA gene that is associated with several transfer RNAs
(tRNAs). Operon A has the organization 16S - 23S - 5S, whereas operon B lacks
the 5S component. An alanine tRNA is situated in the spacer region between the
16S and 23S subunits in both operons. The majority of proteins associated with
the ribosomal subunits (especially the small subunit) are present in both Bacteria
and Eukaryotes. However, the relatively protein-rich eukaryotic ribosome
contains additional ribosomal proteins not found in the bacterial ribosome. A
smaller number of bacteria-specific ribosomal proteins exist as well. The M.
jannaschii genome contains all ribosomal proteins that are common to eukaryotes
and bacteria. It shows no homologs of the bacterial-specific ribosomal proteins,
but does possess homologs of a number of the eukaryotic-specific ones.
Homologs of all archaea-specific ribosomal proteins that have been reported to 
date (Lechner, K., et al., J. Mol. Evol. 29:20 (1989); Kopke, A.K.E. and
Wittmann-Liebold, B., Can. J. Microbiol. 35:11 (1989)) are found in M.
jannaschii. 

As previously shown for other archaea (Iwabe, N., et al., Proc. Natl.
Acad. Sci. USA 86:9355 (1989); Gogarten, J.P., et al., Proc. Natl. Acad. Sci. USA
86:6661 (1989); Brown, J.R. and Doolittle, W.F., Proc. Natl. Acad. Sci. USA
92:2441 (1995)), the Methanococcus translation elongation factors EF-1α (EF-Tu
in bacteria) and EF-2 (EF-G in bacteria) are most similar to their eukaryotic
counterparts. In addition, the M. jannaschii genome contains 11 translation
initiation factor genes. Three of these genes encode the subunits homologous to
those of the eukaryotic IF-2, and are reported here in the Archaea for the first
time. A fourth initiation factor gene that encodes a second IF-2 is also found in
M. jannaschii. This additional IF-2 gene is most closely related to the yeast
protein FUN12 which, in turn, appears to be a homolog of the bacterial IF-2. It
is not known which of the two IF-2-like initiation factors identified in M.
jannaschii plays a role in directing the initiator tRNA to the start site of the
mRNA. The fifth identified initiation factor gene in M. jannaschii encodes
IF-1A, which has no bacterial homolog. The sixth gene encodes the
hypusine-containing initiation factor eIF-5a. Two subunits of the translation
initiation factor eIF-2B were identified in M. jannaschii. Finally, three putative
adenosine 5'-triphosphate (ATP)-dependent helicases were identified that
belong to the eIF-4a family of translation initiation factors.

Thirty-seven tRNA genes were identified in the M. jannaschii genome. 
Almost all amino acids encoded by two codons have a single tRNA, except for 
glutamic acid, which has two. Both an initiator and an internal methionyl tRNA 
are present. The two pyrimidine-ending isoleucine codons are covered by a 
single tRNA, while the third (AUA) seems covered by a related tRNA having a 
CAU anticodon. A single tRNA appears to cover the three isoleucine codons. 
Those amino acids encoded by four codons each have two tRNAs, one to cover
the Y-ending, the other the R-ending, codons. Valine has a third tRNA, which is
specific for the GUG codon; and alanine has three tRNAs (two of which are in the
spacer regions separating the 16S and 23S subunits in the two ribosomal RNA
operons). Leucine, serine and arginine, all of which have six codons, each possess
three corresponding tRNAs. The genes for the internal methionine and
tryptophan tRNAs contain introns in the region of their anticodon loops.

A tRNA also exists for selenocysteine (UGA codon). At least four genes
in M. jannaschii contain internal stop codons that are potential selenocysteine
codons: the α chain of formate dehydrogenase, coenzyme F420 reducing
hydrogenase, β-chain tungsten formyl methanofuran dehydrogenase, and a
heterodisulfide reductase. Three genes with a putative role in selenocysteine
metabolism were identified by their similarity to the sel genes from other
organisms (Tables 2-3).
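
For illustration, candidate selenocysteine codons of the kind described
above can be located by scanning an open reading frame for in-frame TGA
triplets that precede the terminal codon. The Python sketch below is a
minimal example under stated assumptions: the input is the DNA coding strand
beginning at the start codon, the function name is hypothetical, and the test
sequences are made up. An in-frame TGA is only a candidate; it may equally be
a genuine stop codon or a sequencing error.

```python
# Illustrative sketch; the function name and inputs are hypothetical.
def internal_tga_codons(orf: str) -> list[int]:
    """Return the 0-based codon indices of in-frame TGA codons that occur
    before the final codon of the reading frame."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return [index for index, codon in enumerate(codons[:-1]) if codon == "TGA"]


if __name__ == "__main__":
    # Made-up reading frame: ATG TGA GCT TAA
    print(internal_tga_codons("ATGTGAGCTTAA"))
```

Out-of-frame occurrences of the letters TGA are not reported, because only
triplets read in the frame set by the start codon are examined.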

Recognizable homologs for four of the aminoacyl-tRNA synthetases
(glutamine, asparagine, lysine, and cysteine) were not identified in the M.
jannaschii genome. The absence of a glutaminyl-tRNA synthetase is not
surprising in that a number of organisms, including at least one archaeon, have
none (Wilcox, M., Eur. J. Biochem. 11:405 (1969); Martin, N.C., et al., J. Mol.
Biol. 101:285 (1976); Martin, N.C., et al., Biochemistry 16:4672 (1977); Schon,
A., et al., Biochimie 70:391 (1988); Soll, D. and RajBhandary, U., Eds. Am. Soc.
for Microbiol. (1995)). In these instances, glutaminyl tRNA charging involves
a post-charging conversion mechanism whereby the tRNA is charged by the
glutamyl-tRNA synthetase with glutamic acid, which then is enzymatically
converted to glutamine. A post-charging conversion is also involved in
selenocysteine charging via the seryl-tRNA synthetase. A similar mechanism has
been proposed for asparagine charging, but has never been demonstrated (Wilcox,
M., Eur. J. Biochem. 11:405 (1969); Martin, N.C., et al., J. Mol. Biol. 101:285
(1976); Martin, N.C., et al., Biochemistry 16:4672 (1977); Schon, A., et al.,
Biochimie 70:391 (1988); Soll, D. and RajBhandary, U., Eds. Am. Soc. for
Microbiol. (1995)). The inability to find homologs of the lysine and cysteine
aminoacyl-tRNA synthetases is surprising because bacterial and eukaryotic
versions in each instance show clear homology.

Aminoacyl-tRNA synthetases of M. jannaschii and other archaea
resemble eukaryotic synthetases more closely than they resemble bacterial forms.
The tryptophanyl synthetase is one of the more notable examples, because the M.
jannaschii and eukaryotic versions do not appear to be specifically related to the
bacterial version (de Pouplana, R., et al., Proc. Natl. Acad. Sci. USA 93:166
(1996)). Two versions of the glycyl synthetase are known in bacteria, one that
is very unlike the version found in Archaea and Eukaryotes and one that is an
obvious homolog of it (Wagner, E.A., et al., J. Bacteriol. 177:5179 (1995);
Logan, D.T., et al., EMBO J. 14:4156 (1995)).

Eleven genes encoding subunits of the DNA-dependent RNA polymerase
were identified in the M. jannaschii genome. The sequence similarity between
the subunits and their homologs in Sulfolobus acidocaldarius supports the
evolutionary unity of the archaeal polymerase complex (Woese, C.R. and Wolfe,
R.S., Eds. The Bacteria, vol. VIII (Academic Press, NY, 1985); Langer, D., et al.,
Proc. Natl. Acad. Sci. USA 92:5768 (1995); Lanzendoerfer, M., et al., System.
Appl. Microbiol. 16:656 (1994)). All of the subunits found in M. jannaschii show
greater similarity to their eukaryotic counterparts than to the bacterial homologs.
The genes encoding the five largest subunits (A', A", B', B", D) have homologs
in all organisms. Six genes encode subunits shared only by Archaea and
Eukaryotes (E, H, K, L, and N). The M. jannaschii homolog of the S.
acidocaldarius subunit E is split into two genes designated E' and E".
Sulfolobus acidocaldarius also contains two additional small subunits of RNA
polymerase, designated G and F, that have no counterparts in either Bacteria or
Eukaryotes. No homolog of these subunits was identified in M. jannaschii.

The archaeal transcription initiation system is essentially the same as that 
found in Eukaryotes, and is radically different from the bacterial version (Klenk, 
H.P. and Doolittle, W.F., Curr. Biol 4:920 (1994)). The central molecules in the 
former systems are the TATA-binding protein (TBP) and transcription factor B 
(TFIIB and TFIIIB in Eukaryotes, or simply TFB). In the eukaryotic systems, 
TBP and TFB are parts of larger complexes, and additional factors (such as
TFIIA and TFIIF) are used in the transcription process. However, the M.
jannaschii genome does not contain obvious homologs of TFIIA and TFIIF.

Several components of the replication machinery were identified in M.
jannaschii. The M. jannaschii genome appears to encode a single DNA-dependent
polymerase that is a member of the B family of polymerases (Bernard,
A., et al., EMBO J. 6:4219 (1987); Cullman, G., et al., Molec. Cell Biol. 15:4661
(1995); Uemori, T., et al., J. Bacteriol. 177:2164 (1995); Delarue, M., et al.,
Prot. Engineer. 3:461 (1990); Gavin, K.A., et al., Science 270:1667 (1995)). The
polymerase shares sequence similarity and three motifs with other family B
polymerases, including eukaryotic α, γ, and ε polymerases, bacterial polymerase
II, and several archaeal polymerases. However, it is not homologous to bacterial
polymerase I and has no homologs in H. influenzae or M. genitalium.

Primer recognition by the polymerase takes place through a structure-specific
DNA binding complex, the replication factor complex (rfc) (Bernard, A.,
et al., EMBO J. 6:4219 (1987); Cullman, G., et al., Molec. Cell Biol. 15:4661
(1995); Uemori, T., et al., J. Bacteriol. 177:2164 (1995); Delarue, M., et al.,
Prot. Engineer. 3:461 (1990); Gavin, K.A., et al., Science 270:1667 (1995)). In
humans and yeast, the rfc is composed of five proteins: a large subunit and four
small subunits that have an associated adenosine triphosphatase (ATPase) activity
stimulated by proliferating cell nuclear antigen (PCNA). Two genes in M.
jannaschii are putative members of a eukaryotic-like replication factor complex.
One of the genes in M. jannaschii is a putative homolog of the large subunit of
the rfc, whereas the second is a putative homolog of one of the small subunits.
Among Eukaryotes, the rfc proteins share sequence similarity in eight signature
domains (Bernard, A., et al., EMBO J. 6:4219 (1987); Cullman, G., et al., Molec.
Cell Biol. 15:4661 (1995); Uemori, T., et al., J. Bacteriol. 177:2164 (1995);
Delarue, M., et al., Prot. Engineer. 3:461 (1990); Gavin, K.A., et al., Science
270:1667 (1995)). Domain I is conserved only in the large subunit among
Eukaryotes and is similar in sequence to DNA ligases. This domain is missing
in the large-subunit homolog in M. jannaschii. The remaining domains in the two
M. jannaschii genes are well-conserved relative to the eukaryotic homologs. Two
features of the sequence similarity in these domains are of particular interest.
First, domain II (an ATPase domain) of the small-subunit homolog is split
between two highly conserved amino acids (lysine and threonine) by an
intervening sequence of unknown function. Second, the sequence of domain VI
has regions that are useful for distinguishing between bacterial and eukaryotic rfc
proteins (Bernard, A., et al., EMBO J. 6:4219 (1987); Cullman, G., et al., Molec.
Cell Biol. 15:4661 (1995); Uemori, T., et al., J. Bacteriol. 177:2164 (1995);
Delarue, M., et al., Prot. Engineer. 3:461 (1990); Gavin, K.A., et al., Science
270:1667 (1995)); the rfc sequence for M. jannaschii shares the characteristic
eukaryotic signature in this domain.

We have attempted to identify an origin of replication by searching the M.
jannaschii genome sequence with a variety of bacterial and eukaryotic
replication-origin consensus sequences. Searches with oriC, ColE1, and
autonomously replicating sequences from yeast (Bernard, A., et al., EMBO J.
6:4219 (1987); Cullman, G., et al., Molec. Cell Biol. 15:4661 (1995); Uemori, T.,
et al., J. Bacteriol. 177:2164 (1995); Delarue, M., et al., Prot. Engineer. 3:461
(1990); Gavin, K.A., et al., Science 270:1667 (1995)) did not identify an origin
of replication. With respect to the related cellular processes of replication
initiation and cell division, the M. jannaschii genome contains two genes that are
putative homologs of Cdc54, a yeast protein that belongs to a family of putative
DNA replication initiation proteins (Whitbred, L.A. and Dalton, S., Gene 155:113
(1995)). A third potential regulator of cell division in M. jannaschii is 55%
similar at the amino acid level to pelota, a Drosophila protein involved in the 
regulation of the early phases of meiotic and mitotic cell division (Eberhart, C.G. 
and Wasserman, S.A., Development 121:3477 (1995)). 

In contrast to the putative rfc complex and the initiation of DNA 
replication, the cell division proteins from M. jannaschii most resemble their
bacterial counterparts (Rothfield, L.I. and Zhao, C.R., Cell 84:183 (1996);
Lutkenhaus, J., Curr. Opin. Gen. Devel. 3:783 (1993)). Two genes similar to that
encoding FtsZ, a ubiquitous bacterial protein, are found in M. jannaschii. FtsZ
is a polymer-forming, guanosine triphosphate (GTP)-hydrolyzing protein with
tubulin-like elements; it is localized to the site of septation and forms a
constricting ring between the dividing cells. One gene similar to FtsJ, a bacterial
cell division protein of undetermined function, also is found in M. jannaschii.
Three additional genes (MinC, MinD, and MinE) function in concert in Bacteria
to determine the site of septation during cell division. In M. jannaschii, three
MinD-like genes were identified, but none for MinC or MinE. Neither
spindle-associated proteins characteristic of eukaryotic cell division nor bacterial
mechanochemical enzymes necessary for partitioning the condensed
chromosomes were detected in the M. jannaschii genome. Taken together, these
observations raise the possibility that cell division in M. jannaschii might occur
via a mechanism specific for the Archaea. 

The structural and functional conservation of the signal peptide of 
secreted proteins in Archaea, Bacteria, and Eukaryotes suggests that the basic 
mechanisms of membrane targeting and translocation may be similar among all 
three domains of life. The secretory machinery of M. jannaschii appears to be a
rudimentary apparatus relative to that of bacterial and eukaryotic systems and
consists of (i) a signal peptidase (SP) that cleaves the signal peptide of
translocating proteins, (ii) a preprotein translocase that is the major constituent
of the membrane-localized translocation channel, (iii) a ribonucleoprotein
complex (signal recognition particle, SRP) that binds to the signal peptide and
guides nascent proteins to the cell membrane, and (iv) a docking protein that acts
as a receptor for the SRP. The 7S RNA component of the SRP from M.
jannaschii shows a highly conserved structural domain shared by other Archaea,
Bacteria, and Eukaryotes (Kaine, B.P. and Merkel, V.L., J. Bacteriol. 171:4261
(1989); Poritz, M.A., et al., Cell 55:4 (1988)). However, the predicted secondary
structure of the 7S RNA SRP component in Archaea is more like that found in
Eukaryotes than in Bacteria (Kaine, B.P. and Merkel, V.L., J. Bacteriol. 171:4261
(1989); Poritz, M.A., et al., Cell 55:4 (1988)). The SP and docking proteins from
M. jannaschii are most similar to their eukaryotic counterparts; the translocase is
most similar to the SecY translocation-associated protein in Escherichia coli.

A second distinct signal peptide is found in the flagellin genes of M.
jannaschii. Alignment of flagellin genes from M. voltae (Faguy, D.M., et al.,
Can. J. Microbiol. 40:61 (1994); Kalmokoff, M.L., et al., Arch. Microbiol.
157:481 (1992)) and M. jannaschii reveals a highly conserved NH2-terminus (31
of the first 50 residues are identical in all of the mature flagellins). The peptide
sequence of the M. jannaschii flagellin indicates that the protein is cleaved after
the canonical Gly-12 position, and it is proposed to be similar to type-IV pilins
of Bacteria (Faguy, D.M., et al., Can. J. Microbiol. 40:61 (1994); Kalmokoff,
M.L., et al., Arch. Microbiol. 157:481 (1992)).

Five histone genes are present in the M. jannaschii genome, three on the
main chromosome and two on the large ECE. These genes are homologs of
eukaryotic histones (H2a, H2b, H3, and H4) and of the eukaryotic transcription-
related CAAT-binding factor CBF-A (Sandman, K., et al., Proc. Natl. Acad. Sci.
USA 87:5788 (1990)). The similarity between archaeal and eukaryotic histones
suggests that the two groups of organisms resemble one another in the roles
histones play both in genome supercoiling dynamics and in gene expression.
The five M. jannaschii histone genes show greatest similarity among themselves
even though a histone sequence is available from the closely related species, 
Methanococcus voltae. This intraspecific similarity suggests that the gene 
duplications that produced the five histone genes occurred on the M. jannaschii 
lineage per se. 

Self-splicing portions of a peptide sequence that generally encode a DNA 
endonuclease activity are called inteins, in analogy to introns (Kane, P.M., et al.,
Science 250:651 (1990); Hirata, R., et al., J. Biol. Chem. 265:6126 (1990);
Cooper, A. and Stevens, T., TIBS 20:351 (1995); Xu, M.Q., et al., Cell 75:1371
(1993); Perler et al., Proc. Natl. Acad. Sci. USA 89:5511 (1992); Cooper et al.,
EMBO J. 12:2575 (1993); Michel et al., Biochimie 64:S61 (1992); Pietrokovski,
S., Prot. Sci. 3:2340 (1994)). Most inteins in the M. jannaschii genome were
identified by (i) similarity of the bounding exteins to other proteins, (ii) similarity
of the inteins to those previously described, (iii) presence of the dodecapeptide
endonuclease motifs, and (iv) canonical intein-extein junction sequences. In two
instances (MJ0832 and MJ0043), the similarity to other database sequences did
not unambiguously define the NH2-terminal extein-intein junction, so it was
necessary to rely on consensus sequences to select the putative site. The inteins
in MJ1042 and MJ0542 have previously uncharacterized COOH-terminal splice
junctions, GNC and FNC, respectively.

The sequences remaining after an intein is excised are called exteins, in 
analogy to exons. Exteins are spliced together after the excision of one or more 
inteins to form functional proteins. The biological significance and role of inteins
are not clearly understood (Kane, P.M., et al., Science 250:651 (1990); Hirata, R.,
et al., J. Biol. Chem. 265:6126 (1990); Cooper, A. and Stevens, T., TIBS 20:351
(1995); Xu, M.Q., et al., Cell 75:1371 (1993); Perler et al., Proc. Natl. Acad. Sci.
USA 89:5511 (1992); Cooper et al., EMBO J. 12:2575 (1993); Michel et al.,
Biochimie 64:S61 (1992); Pietrokovski, S., Prot. Sci. 3:2340 (1994)). Fourteen
genes in the M. jannaschii genome contain 18 putative inteins, a significant
increase over the approximately 10 intein-containing genes that have been described
(Kane, P.M., et al., Science 250:651 (1990); Hirata, R., et al., J. Biol. Chem.
265:6126 (1990); Cooper, A. and Stevens, T., TIBS 20:351 (1995); Xu, M.Q., et
al., Cell 75:1371 (1993); Perler et al., Proc. Natl. Acad. Sci. USA 89:5511 (1992);
Cooper et al., EMBO J. 12:2575 (1993); Michel et al., Biochimie 64:S61 (1992);
Pietrokovski, S., Prot. Sci. 3:2340 (1994)) (Table 4). The only previously
described inteins in the Archaea are in the DNA polymerase genes of the
Thermococcales (Kane, P.M., et al., Science 250:651 (1990); Hirata, R., et al., J.
Biol. Chem. 265:6126 (1990); Cooper, A. and Stevens, T., TIBS 20:351 (1995);
Xu, M.Q., et al., Cell 75:1371 (1993); Perler et al., Proc. Natl. Acad. Sci. USA
89:5511 (1992); Cooper et al., EMBO J. 12:2575 (1993); Michel et al., Biochimie
64:S61 (1992); Pietrokovski, S., Prot. Sci. 3:2340 (1994)). The M. jannaschii
DNA polymerase gene has two inteins in the same locations as those in
Pyrococcus sp. strain KOD1. In this case, the exteins exhibit 46% amino acid
identity, whereas intein 2 of the two organisms has only 33% identity. This 
divergence suggests that intein 2 has not been recently (laterally) transferred 
between the Thermococcales and M. jannaschii. In contrast, the intein 1
sequences are 56% identical, more than that of the gene containing them, and
comparable to the divergence of inteins within the Thermococcales. This high
degree of sequence similarity might be the result of an intein transfer more recent
than the splitting of these species. The large number of inteins found in M.
jannaschii led us to question whether these inteins have been increasing in
number by moving within the genome. If this were so, we would expect to find
some pairs of inteins that are particularly similar. Comparisons of these and other
available intein sequences showed that the closest relationships are those noted
above linking the DNA polymerase inteins to correspondingly positioned
elements in the Thermococcales. Within M. jannaschii the highest identity
observed was 33% for a 380-bp portion of two inteins. This finding suggests that 
the diversification of the inteins predates the divergence of the M. jannaschii and 
Pyrococcus DNA polymerases. 

Three families of repeated genetic elements were identified in the M.
jannaschii genome. Within two of the families, at least two members were
identified as ORFs with a limited degree of sequence similarity to bacterial
transposases. Members of the first family, designated ISAMJ1, are repeated 10
times on the main chromosome and once on the large ECE (Fig. 2). There is no
sequence similarity between the IS elements in M. jannaschii and the ISM1
mobile element described previously for Methanobrevibacter smithii (Hamilton,
P.T., et al., Mol. Gen. Genet. 200:47 (1985)). Two members of this family were
identified as ORFs and are 27% identical (at the amino acid sequence level) to a
transposase from Bacillus thuringiensis (IS240; GenBank accession number
M23741). Relative to these two members, the remaining members of the ISAMJ1
family are missing an internal region of several hundred nucleotides (Fig. 2).
With one exception, all members of this family end with 16-bp terminal inverted
repeats typical of insertion sequences. One member is missing the terminal
repeat at its 5' end. The second family consists of two ORFs that are identical
across 928 bp. The ORFs are 23% identical at the amino acid sequence level to
the COOH-terminus of a transposase from Lactococcus lactis (IS982; GenBank
accession number L34754). Neither of the members of the second family
contains terminal inverted repeats.

Eighteen copies of the third family of repeated genetic structures (Fig. 3) 
are distributed fairly evenly around the M. jannaschii genome. Unlike the genetic
elements described above, none of the components of this repeat unit appears to
have coding potential. The repeat structure is composed of a long segment
followed by one to 25 tandem repetitions of a short segment. The short segments
are separated by sequence that is unique within and among the complete repeat
structures. Three similar types of short segments were identified; however, the
type of short repeat is consistent within each repeat structure, except for variation
of the last short segment in six repeat structures. Similar tandem repeats of short
segments have been observed in Bacteria and other Archaea (Mojica, F.J.M., et
al., Mol. Micro. 17:85 (1995)) and have been hypothesized to participate in
chromosome partitioning during cell division.

The 16-kbp ECE from M. jannaschii contains 12 ORFs, none of which
had a significant full-length match to any published sequence. The 58-kbp ECE
contains 44 predicted protein-coding regions, 5 of which had matches to genes
in the database. Two of the genes are putative archaeal histones, one is a
sporulation-related protein (SOJ protein), and two are type I restriction
modification enzymes. There are several instances in which predicted protein-
coding regions or repeated genetic elements on the large ECE have similar
counterparts on the main chromosome of M. jannaschii. The degree of nucleotide
sequence similarity between genes present on both the ECE and the main 
chromosome ranges from 70 to 90%, suggesting that there has been relatively 
recent exchange of at least some genetic material between the large ECE and the 
main chromosome. 

All the predicted protein-coding regions from M. jannaschii were
searched against each other in order to identify families of paralogous genes
(genes related by gene duplication, not speciation). The initial criterion for
grouping paralogs was >30% amino acid sequence identity over 50 consecutive
amino acid residues. Groups of predicted protein-coding regions were then
aligned and inspected individually to ensure that the sequence similarity extended
over most of their lengths. This curatorial process resulted in the identification
of more than 100 gene families, half of which have no database matches. The
largest identified gene family (16 members: MJ0625, MJECL28, MJ1076,
MJ1006, MJ1659, MJ0075, MJ1609, MJECL19, MJECL18, MJ0147, MJ0801,
MJ1301, MJ0632, MJ1010, MJ0074, and MJ0439) contains almost 1% of the
total predicted protein-coding regions in M. jannaschii.
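
The initial grouping criterion stated above (>30% identity over 50 consecutive residues) can be illustrated with a short sketch. This is not the procedure actually used to build the gene families; it assumes the two sequences have already been aligned to equal length with "-" marking gaps, measures the window in alignment columns, and uses hypothetical function names chosen only for illustration.

```python
def best_window_identity(aln_a, aln_b, window=50):
    """Highest fraction of identical residues in any run of `window`
    consecutive alignment columns (assumption: inputs are pre-aligned)."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if n < window:
        return 0.0  # too short to contain a full window
    # 1 where both residues are identical and not gaps, else 0
    matches = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    current = sum(matches[:window])
    best = current
    # slide the window one column at a time, updating the running count
    for i in range(window, n):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return best / window


def passes_initial_criterion(aln_a, aln_b, window=50, threshold=0.30):
    """True when some window exceeds the identity threshold (strictly >)."""
    return best_window_identity(aln_a, aln_b, window) > threshold


if __name__ == "__main__":
    # Made-up toy sequences, not M. jannaschii data
    a = "A" * 50 + "C" * 50
    b = "A" * 20 + "G" * 30 + "T" * 50
    print(best_window_identity(a, b), passes_initial_criterion(a, b))
```

Pairs that pass such a screen would still need the full-length alignment and individual inspection described above before being assigned to a family.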

Despite the availability for comparison of two complete bacterial genomes 
and several hundred megabase pairs of eukaryotic sequence data, the majority of
genes in M. jannaschii cannot be identified on the basis of sequence similarity.
Previous evidence for the shared common ancestry of the Archaea and
Eukaryotes was based on a small set of gene sequences (Iwabe, N., et al., Proc. Natl.
Acad. Sci. USA 86:9355 (1989); Gogarten, J.P., et al., Proc. Natl. Acad. Sci. USA
86:6661 (1989); Brown, J.R. and Doolittle, W.F., Proc. Natl. Acad. Sci. USA
92:2441 (1995)). The complete genome of M. jannaschii allows us to move
beyond a "gene by gene" approach to one that encompasses the larger picture of
metabolic capacity and cellular systems. The anabolic genes of M. jannaschii
(especially those related to energy production and nitrogen fixation) reveal an
ancient metabolic world shared largely by Bacteria and Archaea. That many
basic autotrophic pathways appear to have a common evolutionary origin
suggests that the most recent universal common ancestor to all three domains of
extant life had the capacity for autotrophy. The Archaea and Bacteria also share
structural and organizational features that the most recent universal prokaryotic
ancestors also likely possessed, such as circular genomes and genes organized as
operons. In contrast, the cellular information-processing and secretion systems
in M. jannaschii demonstrate the common ancestry of Eukaryotes and Archaea.
Although components of these systems are present in all three domains,
their apparent refinement over time (especially transcription and
translation) indicates that the Archaea and Eukaryotes share a common
evolutionary trajectory independent of the lineage of Bacteria.

Example 2

Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Methanococcus jannaschii genome, such as 
those disclosed in Tables 2(a), 2(b) and 3 can be used, in accordance with the 
present invention, to prepare PCR primers. The PCR primers are preferably at
least 15 bases, and more preferably at least 18 bases in length. When selecting
a primer sequence, it is preferred that the primer pairs have approximately the 
same G/C ratio, so that melting temperatures are approximately the same. The 
PCR primers are useful during PCR cloning of the ORFs described herein. 
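
The selection guidance above (minimum length, similar G/C ratio, similar melting temperature) can be sketched as a simple check. The text does not specify how melting temperature is estimated; the sketch below assumes the common rule of thumb Tm = 2(A+T) + 4(G+C) in °C for short oligonucleotides, and the 4 °C tolerance, the function names, and the example primers are all hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C (an assumption,
    not a method named in the text): 2 per A/T plus 4 per G/C."""
    s = primer.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def check_primer_pair(fwd, rev, min_len=15, preferred_len=18, max_tm_diff=4):
    """Report whether a primer pair meets the length and Tm guidance."""
    shortest = min(len(fwd), len(rev))
    tm_diff = abs(wallace_tm(fwd) - wallace_tm(rev))
    return {
        "meets_minimum_length": shortest >= min_len,
        "meets_preferred_length": shortest >= preferred_len,
        "gc_fraction": (gc_fraction(fwd), gc_fraction(rev)),
        "tm_difference": tm_diff,
        "tm_matched": tm_diff <= max_tm_diff,  # tolerance is illustrative
    }


if __name__ == "__main__":
    # Made-up 18-mers, not taken from the M. jannaschii sequence
    print(check_primer_pair("ATGCATGCATGCATGCAT", "GCATGCATGCATGCATGC"))
```

More accurate nearest-neighbor estimates exist; the simple rule is used here only because it makes the link between G/C content and melting temperature explicit.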

Example 3

Gene expression from DNA Sequences Corresponding to ORFs 

A fragment of the Methanococcus jannaschii genome (preferably, a
protein-encoding sequence) provided in Tables 2(a), 2(b) or 3 is introduced into
an expression vector using conventional technology (techniques to transfer cloned
sequences into expression vectors that direct protein translation in mammalian,
yeast, insect or bacterial expression systems are well known in the art).
Commercially available vectors and expression systems are available from a
variety of suppliers including Stratagene (La Jolla, California), Promega
(Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to
enhance expression and facilitate proper protein folding, the codon context and
codon pairing of the sequence may be optimized for the particular expression
organism, as explained by Hatfield et al., U.S. Pat. No. 5,082,767, which is
hereby incorporated by reference.

The following is provided as one exemplary method to generate
polypeptide(s) from a cloned ORF of the Methanococcus genome whose
sequence is provided in SEQ ID NOS: 1, 2 and 3. A poly A sequence can be
added to the construct by, for example, splicing out the poly A sequence from
pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and
incorporating it into the mammalian expression vector pXT1 (Stratagene) for use
in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the
gag gene from Moloney Murine Leukemia Virus. The position of the LTRs in
the construct allows efficient stable transfection. The vector includes the Herpes
Simplex thymidine kinase promoter and the selectable neomycin gene. The
Methanococcus DNA is obtained by PCR from the bacterial vector using
oligonucleotide primers complementary to the Methanococcus DNA and
containing restriction endonuclease sequences for PstI incorporated into the 5'
primer and BglII at the 5' end of the corresponding Methanococcus DNA 3'
primer, taking care to ensure that the Methanococcus DNA is positioned such that
it is followed by the poly A sequence. The purified fragment obtained from the
resulting PCR reaction is digested with PstI, blunt ended with an exonuclease,
digested with BglII, purified and ligated to pXT1, now containing a poly A
sequence and digested with BglII.

The ligated product is transfected into mouse NIH 3T3 cells using 
Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions 
outlined in the product specification. Positive transfectants are selected after 
growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri).
The protein is preferably released into the supernatant. However, if the protein
has membrane binding domains, the protein may additionally be retained within 
the cell or expression may be restricted to the cell surface. 

Since it may be necessary to purify and locate the transfected product, 
synthetic 15-mer peptides synthesized from the predicted Methanococcus DNA 
sequence are injected into mice to generate antibody to the polypeptide encoded 
by the Methanococcus DNA. 

If antibody production is not possible, the Methanococcus DNA sequence
is additionally incorporated into eukaryotic expression vectors and expressed as
a chimeric with, for example, β-globin. Antibody to β-globin is used to purify the
chimeric. Corresponding protease cleavage sites engineered between the β-globin
gene and the Methanococcus DNA are then used to separate the two polypeptide
fragments from one another after translation. One useful expression vector for
generating β-globin chimerics is pSG5 (Stratagene). This vector encodes rabbit
β-globin. Intron II of the rabbit β-globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases
the level of expression. These techniques as described are well known to those
skilled in the art of molecular biology. Standard methods are available from the
technical assistance representatives from Stratagene, Life Technologies, Inc., or
Promega. Polypeptides may additionally be produced from either construct using
in vitro translation systems such as In vitro Express™ Translation Kit
(Stratagene).

Example 4 

E. coli Expression of a M. jannaschii ORF and protein purification

A M. jannaschii ORF described in Table 2(a), 2(b), or 3 is selected and
amplified using PCR oligonucleotide primers designed from the nucleotide
sequences flanking the selected ORF and/or from portions of the ORF's NH2- or
COOH-terminus. Additional nucleotides containing restriction sites to facilitate
cloning are added to the 5' and 3' sequences, respectively.
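
Adding restriction sites to the primers can be sketched as follows. This is a minimal illustration, not the primer design used in this Example: it assumes the primers anneal to the first and last 18 nucleotides of the ORF itself, uses the SalI (GTCGAC) and XbaI (TCTAGA) recognition sequences as the added sites, and works on a made-up ORF. The function names are hypothetical, and reading-frame considerations relative to the vector are deliberately left out.

```python
# Translation table for complementing unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tailed_primers(orf, fwd_site, rev_site, anneal_len=18):
    """Return (forward, reverse) primers, each written 5'->3', with a
    restriction site added at the 5' end of the annealing region."""
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    fwd = fwd_site + orf[:anneal_len]
    # The reverse primer anneals to the top strand's 3' end, so it is the
    # reverse complement of the last `anneal_len` bases.
    rev = rev_site + reverse_complement(orf[-anneal_len:])
    return fwd, rev


if __name__ == "__main__":
    toy_orf = "ATGAAACCCGGGTTTAAACCCGGGTTTTAA"  # made-up sequence
    fwd, rev = tailed_primers(toy_orf, "GTCGAC", "TCTAGA")
    print(fwd)
    print(rev)
```

In practice a few extra bases are often placed 5' of each site so that the enzymes cut efficiently near the end of the PCR product; that detail is omitted here.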

The restriction sites are selected to be convenient to restriction sites in the 
bacterial expression vector pD10 (pQE9), which is used for bacterial expression
(Qiagen, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). [pD10]pQE9 encodes
ampicillin antibiotic resistance ("Ampr") and contains a bacterial origin of
replication ("ori"), an IPTG inducible promoter, a ribosome binding site ("RBS"), 
a 6-His tag and restriction enzyme sites. 

The amplified M. jannaschii DNA and the vector pQE9 both are digested
with SalI and XbaI and the digested DNAs are then ligated together. Insertion of
the M. jannaschii DNA into the restricted pQE9 vector places the M. jannaschii
coding region downstream of and operably linked to the vector's IPTG-inducible
promoter and in-frame with an initiating AUG appropriately positioned for
translation of the M. jannaschii protein.

The ligation mixture is transformed into competent E. coli cells using 
standard procedures. Such procedures are described in Sambrook et al. 
Molecular Cloning: a Laboratory Manual, 2nd Ed.; Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, N.Y. (1989). E. coli strain M15/rep4, 
containing multiple copies of the plasmid pREP4, which expresses lac repressor 
and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative
example described herein. This strain, which is only one of many that are 
suitable for expressing M. jannaschii protein, is available commercially from 
Qiagen. 

Transformants are identified by their ability to grow on LB plates in the 
presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant 
colonies and the identity of the cloned DNA confirmed by restriction analysis. 
Clones containing the desired constructs are grown overnight ("O/N") in liquid 
culture in LB media supplemented with both ampicillin (100 µg/ml) and
kanamycin (25 µg/ml).

The O/N culture is used to inoculate a large culture, at a dilution of 
approximately 1:100 to 1:250. The cells are grown to an optical density at 600 nm
("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside
("IPTG") is then added to a final concentration of 1 mM to induce transcription
from lac repressor sensitive promoters, by inactivating the lacI repressor. Cells
subsequently are incubated further for 3 to 4 hours. Cells then are harvested by
centrifugation and disrupted, by standard methods. Inclusion bodies are purified
from the disrupted cells using routine collection techniques, and protein is
solubilized from the inclusion bodies into 8M urea. The 8M urea solution
containing the solubilized protein is passed over a PD-10 column in 2X
phosphate-buffered saline ("PBS"), thereby removing the urea, exchanging the
buffer and refolding the protein. The protein is purified by a further step of
chromatography to remove endotoxin followed by sterile filtration. The sterile
filtered protein preparation is stored in 2X PBS at a concentration of 95 µg/ml.

Example 5

Cloning and Expression of a M. jannaschii protein in a Baculovirus 
Expression System 

A M. jannaschii ORF described in Table 2(a), 2(b), or 3 is selected and 
amplified as above. The amplified DNA is isolated from a 1% agarose gel using 
a commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The
DNA then is digested with XbaI and again purified on a 1% agarose gel. This
DNA is designated herein as F2. 

The vector pA2-GP is used to express the M. jannaschii protein in the
baculovirus expression system as described in Summers et al., A Manual of
Methods for Baculovirus Vectors and Insect Cell Culture Procedures, Texas
Agricultural Experimental Station Bulletin No. 1555 (1987). The pA2-GP
expression vector contains the strong polyhedrin promoter of the Autographa
californica nuclear polyhedrosis virus (AcMNPV) followed by convenient
restriction sites. The signal peptide of AcMNPV gp67, including the N-terminal
methionine, is located just upstream of a BamHI site. The polyadenylation site
from the simian virus 40 ("SV40") is used for efficient polyadenylation. For an
easy selection of recombinant virus, the beta-galactosidase gene from E. coli is
inserted in the same orientation as the polyhedrin promoter and is followed by the 
polyadenylation signal of the polyhedrin gene. The polyhedrin sequences are 
flanked at both sides by viral sequences for cell-mediated homologous 
recombination with wild-type viral DNA to generate viable virus that express the 
cloned polynucleotide. 

Many other baculovirus vectors could be used in place of pA2-GP, such 
as pAc373, pVL941 and pAcIM1 provided, as those of skill readily will
appreciate, that construction provides appropriately located signals for
transcription, translation, trafficking and the like, such as an in-frame AUG and
a signal peptide, as required. Such vectors are described in Luckow et al.,
Virology 170:31-39, among others.

The plasmid is digested with the restriction enzyme XbaI and then is
dephosphorylated using calf intestinal phosphatase, using routine procedures
known in the art. The DNA is then isolated from a 1% agarose gel using a
commercially available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.). This
vector DNA is designated herein "V2".

Fragment F2 and the dephosphorylated plasmid V2 are ligated together 
with T4 DNA ligase. E. coli HB101 cells are transformed with ligation mix and
spread on culture plates. Bacteria are identified that contain the plasmid with the
M. jannaschii gene by digesting DNA from individual colonies using XbaI and
then analyzing the digestion product by gel electrophoresis. The sequence of the 
cloned fragment is confirmed by DNA sequencing. This plasmid is designated 
herein pBacM jannaschii. 

5 µg of the plasmid pBacM jannaschii is co-transfected with 1.0 µg of a
commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA.), using the lipofection method described by
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). 1 µg of
BaculoGold™ virus DNA and 5 µg of the plasmid pBacM jannaschii are mixed
in a sterile well of a microtiter plate containing 50 µl of serum-free Grace's
medium (Life Technologies Inc., Gaithersburg, MD). Afterwards 10 µl Lipofectin
plus 90 µl Grace's medium are added, mixed and incubated for 15 minutes at
room temperature. Then the transfection mixture is added drop-wise to Sf9 insect
cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate with 1 ml Grace's
medium without serum. The plate is rocked back and forth to mix the newly
added solution. The plate is then incubated for 5 hours at 27°C. After 5 hours the
transfection solution is removed from the plate and 1 ml of Grace's insect medium
supplemented with 10% fetal calf serum is added. The plate is put back into an
incubator and cultivation is continued at 27°C for four days.

After four days the supernatant is collected and a plaque assay is 
performed, as described by Summers and Smith, cited above. An agarose gel 
with "Blue Gal" (Life Technologies Inc., Gaithersburg) is used to allow easy 
identification and isolation of gal-expressing clones, which produce blue-stained 
plaques. (A detailed description of a "plaque assay" of this type can also be 
found in the user's guide for insect cell culture and baculovirology distributed by 
Life Technologies Inc., Gaithersburg, pages 9-10.) 

Four days after serial dilution, the virus is added to the cells. After 
appropriate incubation, blue-stained plaques are picked with the tip of an 
Eppendorf pipette. The agar containing the recombinant viruses is then 
resuspended in an Eppendorf tube containing 200 µl of Grace's medium. The agar 
is removed by a brief centrifugation and the supernatant containing the 
recombinant baculovirus is used to infect Sf9 cells seeded in 35 mm dishes. Four 
days later the supernatants of these culture dishes are harvested and then they are 
stored at 4°C. A clone containing the properly inserted M. jannaschii gene is 
identified by DNA analysis including restriction mapping and sequencing. This 
is designated herein as V-M jannaschii. 

Sf9 cells are grown in Grace's medium supplemented with 10% 
heat-inactivated FBS. The cells are infected with the recombinant baculovirus V-M 
jannaschii at a multiplicity of infection ("MOI") of about 2 (about 1 to about 3). 
Six hours later the medium is removed and is replaced with SF900 II medium 
minus methionine and cysteine (available from Life Technologies Inc., 
Gaithersburg). 42 hours later, 5 µCi of ³⁵S-methionine and 5 µCi of ³⁵S-cysteine 
(available from Amersham) are added. The cells are further incubated for 16 
hours and then they are harvested by centrifugation, lysed and the labeled proteins 
are visualized by SDS-PAGE and autoradiography. 
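
The multiplicity of infection used above is the ratio of infectious virus particles to cells, so the inoculum volume follows from the cell count and the titer of the virus stock. The following is a minimal sketch only: the text does not state a titer or a cell count, so the figures in the usage example are hypothetical, as is the function name.

```python
def virus_volume_ml(cell_count: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Return the volume of virus stock (ml) needed to reach a given MOI.

    cell_count        -- number of cells to be infected
    moi               -- desired multiplicity of infection (pfu per cell)
    titer_pfu_per_ml  -- titer of the stock in plaque-forming units per ml
                         (an assumed input; not given in the text)
    """
    if cell_count <= 0 or moi <= 0 or titer_pfu_per_ml <= 0:
        raise ValueError("cell_count, moi and titer must all be positive")
    return cell_count * moi / titer_pfu_per_ml


if __name__ == "__main__":
    # Hypothetical example: 1e6 Sf9 cells, stock titer of 1e8 pfu/ml.
    for moi in (1, 2, 3):  # the range "about 1 to about 3" from the text
        volume = virus_volume_ml(1e6, moi, 1e8)
        print(f"MOI {moi}: {volume * 1000:.0f} µl of stock")
```

In practice the titer would come from the plaque assay described earlier in this example.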

Example 6 

Cloning and Expression in Mammalian Cells 

Most of the vectors used for the transient expression of an M. jannaschii 
gene in mammalian cells should carry the SV40 origin of replication. This allows 
the replication of the vector to high copy numbers in cells (e.g., COS cells) which 
express the T antigen required for the initiation of viral DNA synthesis. Any 
other mammalian cell line can also be utilized for this purpose. 

A typical mammalian expression vector contains the promoter element, 
which mediates the initiation of transcription of mRNA, the protein-coding 
sequence, and signals required for the termination of transcription and 
polyadenylation of the transcript. Additional elements include enhancers, Kozak 
sequences and intervening sequences flanked by donor and acceptor sites for 
RNA splicing. Highly efficient transcription can be achieved with the early and 
late promoters from SV40, the long terminal repeats (LTRs) from retroviruses, 
e.g., RSV, HTLV-I, HIV-I and the early promoter of the cytomegalovirus (CMV). 
However, cellular signals can also be used (e.g., human actin promoter). Suitable 
expression vectors for use in practicing the present invention include, for 
example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden), 
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146) and pBC12MI (ATCC 
67109). Mammalian host cells that could be used include human HeLa, 293, H9 
and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, Cos 7 and CV1 African 
green monkey cells, quail QC1-3 cells, mouse L cells and Chinese hamster ovary 
cells. 

Alternatively, the gene can be expressed in stable cell lines that contain 
the gene integrated into a chromosome. Co-transfection with a selectable 
marker such as dhfr, gpt, neomycin or hygromycin allows the identification and 
isolation of the transfected cells. 

The transfected gene can also be amplified to express large amounts of the 
encoded protein. DHFR (dihydrofolate reductase) is a useful marker to 
develop cell lines that carry several hundred or even several thousand copies of 
the gene of interest. Another useful selection marker is the enzyme glutamine 
synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991); Bebbington et al., 
Bio/Technology 10:169-175 (1992)). Using these markers, the mammalian cells 
are grown in selective medium and the cells with the highest resistance are 
selected. These cell lines contain the amplified gene(s) integrated into a 
chromosome. Chinese hamster ovary (CHO) cells are often used for the 
production of proteins. 

The expression vectors pC1 and pC4 contain the strong promoter (LTR) 
of the Rous Sarcoma Virus (Cullen et al., Molecular and Cellular Biology, 
438-447 (March, 1985)) plus a fragment of the CMV-enhancer (Boshart et al., 
Cell 41:521-530 (1985)). Multiple cloning sites, e.g., with the restriction enzyme 
cleavage sites BamHI, XbaI and Asp718, facilitate the cloning of the gene of 
interest. The vectors contain in addition the 3' intron, the polyadenylation and 
termination signal of the rat preproinsulin gene. 

Example 6(a): Cloning and Expression in COS Cells 

The expression plasmid, pM jannaschii HA, is made by cloning a cDNA 
encoding an M. jannaschii protein into the expression vector pcDNAI/Amp (which 
can be obtained from Invitrogen, Inc.). 

The expression vector pcDNAI/Amp contains: (1) an E. coli origin of 
replication effective for propagation in E. coli and other prokaryotic cells; (2) an 
ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; 
(3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV 
promoter, a polylinker, an SV40 intron, and a polyadenylation signal arranged so 
that a cDNA conveniently can be placed under expression control of the CMV 
promoter and operably linked to the SV40 intron and the polyadenylation signal 
by means of restriction sites in the polylinker. 

A DNA fragment encoding the M. jannaschii protein and an HA tag fused 
in frame to its 3' end is cloned into the polylinker region of the vector so that 
recombinant protein expression is directed by the CMV promoter. The HA tag 
corresponds to an epitope derived from the influenza hemagglutinin protein 
described by Wilson et al., Cell 37:767 (1984). The fusion of the HA tag to the 
target protein allows easy detection of the recombinant protein with an antibody 
that recognizes the HA epitope. 
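
The in-frame 3' fusion described above can be illustrated with a short Python sketch. It rests on assumptions the text does not supply: the HA epitope is taken to be the nine-residue peptide YPYDVPDYA, the DNA string below is only one possible coding sequence for it, and the function name and the toy open reading frame are hypothetical.

```python
# One possible coding sequence for the HA epitope YPYDVPDYA (an assumption;
# the text does not give the tag sequence used).
HA_TAG_DNA = "TACCCATACGATGTTCCAGATTACGCT"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_ha_tag(orf: str, stop: str = "TAA") -> str:
    """Append the HA tag in frame to the 3' end of an open reading frame.

    The ORF's own stop codon (if present) is removed so that translation
    reads through into the tag; a single stop codon is added after the tag.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons.pop()  # drop the native stop codon
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("ORF contains an internal stop codon")
    return "".join(codons) + HA_TAG_DNA + stop


if __name__ == "__main__":
    toy_orf = "ATGGCTAAAGGTTAA"  # hypothetical 4-codon ORF plus stop
    print(fuse_ha_tag(toy_orf))
```

Checking that the fused length remains a multiple of three is the simplest guard that the tag is in frame with the target protein.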

The PCR amplified DNA fragment (generated as described above) and the 
vector, pcDNAI/Amp, are digested with HindIII and XhoI and then ligated. The 
ligation mixture is transformed into E. coli strain SURE (available from 
Stratagene Cloning Systems, 11099 North Torrey Pines Road, La Jolla, CA 
92037), and the transformed culture is plated on ampicillin media plates which 
then are incubated to allow growth of ampicillin resistant colonies. Plasmid DNA 
is isolated from resistant colonies and examined by restriction analysis and gel 
sizing for the presence of the M. jannaschii protein-encoding fragment. 

For expression of recombinant M. jannaschii protein, COS cells are transfected 
with an expression vector, as described above, using DEAE-dextran, as 
described, for instance, in Sambrook et al., Molecular Cloning: A Laboratory 
Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989). 
Cells are incubated under conditions for expression of M. jannaschii protein by 
the vector. 

Expression of the M. jannaschii HA fusion protein is detected by 
radiolabelling and immunoprecipitation, using methods described in, for example, 
Harlow et al., Antibodies: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, New York (1988). To this end, two days 
after transfection, the cells are labeled by incubation in media containing 
³⁵S-cysteine for 8 hours. The cells and the media are collected, and the cells are 
washed and then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% 
NP-40, 0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by 
Wilson et al., cited above. Proteins are precipitated from the cell lysate and from 
the culture media using an HA-specific monoclonal antibody. The precipitated 
proteins then are analyzed by SDS-PAGE and autoradiography. An 
expression product of the expected size is seen in the cell lysate, which is not seen 
in negative controls. 

Example 6(b): Cloning and Expression in CHO Cells 

The vector pC1 is used for the expression of an M. jannaschii protein. 
Plasmid pC1 is a derivative of the plasmid pSV2-dhfr [ATCC Accession No. 
37146]. Both plasmids contain the mouse DHFR gene under control of the SV40 
early promoter. Chinese hamster ovary or other cells lacking dihydrofolate 
reductase activity that are transfected with these plasmids can be selected by growing the 
cells in a selective medium (alpha minus MEM, Life Technologies) supplemented 
with the chemotherapeutic agent methotrexate. The amplification of the DHFR 
genes in cells resistant to methotrexate (MTX) has been well documented (see, 
e.g., Alt, F.W., Kellems, R.M., Bertino, J.R., and Schimke, R.T., 1978, J. Biol. 
Chem. 253:1357-1370; Hamlin, J.L. and Ma, C. 1990, Biochem. et Biophys. 
Acta, 1097:107-143; Page, M.J. and Sydenham, M.A. 1991, Biotechnology Vol. 
9:64-68). Cells grown in increasing concentrations of MTX develop resistance 
to the drug by overproducing the target enzyme, DHFR, as a result of 
amplification of the DHFR gene. If a second gene is linked to the DHFR gene it 
is usually co-amplified and over-expressed. It is state of the art to develop cell 
lines carrying more than 1,000 copies of the genes. Subsequently, when the 
methotrexate is withdrawn, cell lines contain the amplified gene integrated into 
the chromosome(s). 

Plasmid pC1 contains, for the expression of the gene of interest, a strong 
promoter of the long terminal repeat (LTR) of the Rous Sarcoma Virus (Cullen 
et al., Molecular and Cellular Biology, March 1985:438-447) plus a fragment 
isolated from the enhancer of the immediate early gene of human 
cytomegalovirus (CMV) (Boshart et al., Cell 41:521-530, 1985). Downstream 
of the promoter are the following single restriction enzyme cleavage sites that 
allow the integration of the genes: BamHI, PvuII, and NruI. Behind these cloning 
sites the plasmid contains translational stop codons in all three reading frames 
followed by the 3' intron and the polyadenylation site of the rat preproinsulin 
gene. Other highly efficient promoters can also be used for the expression, e.g., the 
human β-actin promoter, the SV40 early or late promoters or the long terminal 
repeats from other retroviruses, e.g., HIV and HTLV-I. For the polyadenylation 
of the mRNA other signals, e.g., from the human growth hormone or globin genes, 
can be used as well. 

Stable cell lines carrying the gene of interest integrated into the 
chromosomes can also be selected upon co-transfection with a selectable marker 
such as gpt, G418 or hygromycin. It is advantageous to use more than one 
selectable marker in the beginning, e.g., G418 plus methotrexate. 

The plasmid pC1 is digested with the restriction enzyme BamHI and then 
dephosphorylated using calf intestinal phosphatase by procedures known in the art. 
The vector is then isolated from a 1% agarose gel. 

The M. jannaschii protein-encoding sequence is amplified using PCR 
oligonucleotide primers as described above. An efficient signal for initiation of 
translation in eukaryotic cells, as described by Kozak, M., J. Mol. Biol. 196:947- 
950 (1987), is appropriately located in the vector portion of the construct. The 
amplified fragments are isolated from a 1% agarose gel as described above and 
then digested with the endonucleases BamHI and Asp718 and then purified again 
on a 1% agarose gel. 

The isolated fragment and the dephosphorylated vector are then ligated 
with T4 DNA ligase. E. coli HB101 cells are then transformed, and bacteria are 
identified that contain the plasmid pC1 with the insert in the correct orientation, using 
the restriction enzyme BamHI. The sequence of the inserted gene is confirmed 
by DNA sequencing. 

Transfection of CHO-DHFR-cells 

Chinese hamster ovary cells lacking an active DHFR enzyme are used for 
transfection. 5 µg of the expression plasmid pC1 are cotransfected with 0.5 µg of 
the plasmid pSVneo using the lipofection method (Felgner et al., supra). The 
plasmid pSV2-neo contains a dominant selectable marker, the gene neo from Tn5 
encoding an enzyme that confers resistance to a group of antibiotics including 
G418. The cells are seeded in alpha minus MEM supplemented with 1 mg/ml 
G418. After 2 days, the cells are trypsinized and seeded in hybridoma cloning 
plates (Greiner, Germany) and cultivated for 10-14 days. After this period, 
single clones are trypsinized and then seeded in 6-well petri dishes using different 
concentrations of methotrexate (25 nM, 50 nM, 100 nM, 200 nM, 400 nM). 
Clones growing at the highest concentrations of methotrexate are then transferred 
to new 6-well plates containing even higher concentrations of methotrexate (500 
nM, 1 µM, 2 µM, 5 µM). The same procedure is repeated until clones grow at 
a concentration of 100 µM. 
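
The stepwise selection above amounts to a simple loop: plate at a series of methotrexate concentrations, keep the clones growing at the highest one, and move them to the next series. The sketch below only does the bookkeeping for one round. The two concentration series and the 100 µM end point come from the text; the function names and the growth observations in the usage example are hypothetical.

```python
from typing import Dict, List, Optional

# Concentration series from the text, expressed in nM.
FIRST_ROUND_NM: List[int] = [25, 50, 100, 200, 400]
SECOND_ROUND_NM: List[int] = [500, 1000, 2000, 5000]
TARGET_NM: int = 100_000  # 100 µM end point


def highest_tolerated(growth: Dict[int, bool]) -> Optional[int]:
    """Return the highest concentration (nM) at which a clone grew, or None."""
    tolerated = [conc for conc, grew in growth.items() if grew]
    return max(tolerated) if tolerated else None


def target_reached(conc_nm: Optional[int]) -> bool:
    """True once a clone grows at or above the 100 µM end point."""
    return conc_nm is not None and conc_nm >= TARGET_NM


if __name__ == "__main__":
    # Hypothetical observations for one clone in the first round.
    observed = {conc: conc <= 200 for conc in FIRST_ROUND_NM}
    best = highest_tolerated(observed)
    print(f"highest tolerated: {best} nM; target reached: {target_reached(best)}")
```

Later rounds would reuse the same two functions with higher concentration series until `target_reached` returns true.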

The expression of the desired gene product is analyzed by Western blot 
analysis and SDS-PAGE. 

Example 7 

Production of an Antibody to a Methanococcus jannaschii Protein 

Substantially pure M. jannaschii protein or polypeptide is isolated from 
the transfected or transformed cells described above using an art-known method. 
The protein can also be chemically synthesized. Concentration of protein in the 
final preparation is adjusted, for example, by concentration on an Amicon filter 
device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody 
to the protein can then be prepared as follows: 

Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and 
isolated as described can be prepared from murine hybridomas according to the 
classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or 
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated 
with a few micrograms of the selected protein over a period of a few weeks. The 
mouse is then sacrificed, and the antibody producing cells of the spleen isolated. 
The spleen cells are fused by means of polyethylene glycol with mouse myeloma 
cells, and the excess unfused cells destroyed by growth of the system on selective 
media comprising aminopterin (HAT media). The successfully fused cells are 
diluted and aliquots of the dilution placed in wells of a microtiter plate where 
growth of the culture is continued. Antibody-producing clones are identified by 
detection of antibody in the supernatant fluid of the wells by immunoassay 
procedures, such as ELISA, as originally described by Engvall, E., Meth. 
Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones 
can be expanded and their monoclonal antibody product harvested for use. 
Detailed procedures for monoclonal antibody production are described in Davis, 
L. et al., Basic Methods in Molecular Biology, Elsevier, New York, Section 21-2 
(1989). 

Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of 
a single protein can be prepared by immunizing suitable animals with the 
expressed protein described above, which can be unmodified or modified to 
enhance immunogenicity. Effective polyclonal antibody production is affected 
by many factors related both to the antigen and the host species. For example, 
small molecules tend to be less immunogenic than other molecules and may 
require the use of carriers and adjuvant. Also, host animals vary in response to 
site of inoculations and dose, with both inadequate and excessive doses of antigen 
resulting in low titer antisera. Small doses (ng level) of antigen administered at 
multiple intradermal sites appear to be most reliable. An effective immunization 
protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. 
Metab. 33:988-991 (1971). 

Booster injections can be given at regular intervals, and antiserum 
harvested when antibody titer thereof, as determined semi-quantitatively, for 
example, by double immunodiffusion in agar against known concentrations of the 
antigen, begins to fall (See Ouchterlony, O. et al., Chap. 19 in: Handbook of 
Experimental Immunology, Weir, D., ed., Blackwell (1973)). Plateau 
concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum 
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing 
competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: 
Manual of Clinical Immunology, second edition, Rose and Friedman, (eds.), 
Amer. Soc. For Microbio., Washington, D.C. (1980). 

Antibody preparations prepared according to either protocol are useful in 
quantitative immunoassays which determine concentrations of antigen-bearing 
substances in biological samples; they are also used semi-quantitatively or 
qualitatively to identify the presence of antigen in a biological sample. 

thermoautotrophicum}
3-hydroxy-3-methylglutaryl coenzyme A reductase {Haloferax volcanii}
acyl carrier protein synthase {Pyrococcus furiosus}
bifunctional short chain isoprenyl diphosphate synthase {Methanobacterium thermoautotrophicum}
lipopolysaccharide biosynthesis protein (bplD) {Bordetella pertussis}
mevalonate kinase {Schizosaccharomyces pombe}
nonspecific lipid-transfer protein {Pyrococcus furiosus}
anaerobic ribonucleoside-triphosphate reductase {Escherichia coli}
deoxycytidine triphosphate deaminase {Desulfurolobus ambivalens}
deoxycytidine triphosphate deaminase, putative {Desulfurolobus ambivalens}
deoxyuridylate hydroxymethylase {Methanobacterium thermoautotrophicum}
glycinamide ribonucleotide synthetase {Homo sapiens}
hypothetical protein (PIR:S21569) {Methanobacterium thermoautotrophicum}
hypothetical protein (PIR:S28724) {Methanococcus vannielii}
hypothetical protein (PIR:S38467) {Desulfurococcus mobilis}
hypothetical protein (PIR:S41581) {Methanothermus fervidus}
hypothetical protein (PIR:S41583) {Methanothermus fervidus}
hypothetical protein (PIR:S49379) {Pseudomonas aeruginosa}
hypothetical protein (PIR:S51413) {Saccharomyces cerevisiae}
hypothetical protein (PIR:S51413) {Saccharomyces cerevisiae}
hypothetical protein (PIR:S51413) {Saccharomyces cerevisiae}
hypothetical protein (PIR:S51868) {Saccharomyces cerevisiae}
hypothetical protein (PIR:S52522) {Saccharomyces cerevisiae}
hypothetical protein (PIR:S52979) {Erwinia herbicola}
hypothetical protein (PIR:S53543) {Saccharomyces cerevisiae}
hypothetical protein (SP:P05409) {Methanococcus thermolithotrophicus}
hypothetical protein (SP:P11666) {Escherichia coli}
hypothetical protein (SP:P12049) {Bacillus subtilis}
hypothetical protein (SP:P14021) {Methanococcus vannielii}
hypothetical protein (SP:P14022) {Methanococcus vannielii}
hypothetical protein (SP:P14027) {Methanococcus vannielii}
hypothetical protein (SP:P15886) {Methanococcus vannielii}
hypothetical protein (SP:P15889) {Thermofilum pendens}
hypothetical protein (SP:P22349) {Methanobrevibacter smithii}
hypothetical protein (SP:P25125) {Thermus aquaticus}
hypothetical protein (SP:P25768) {Methanobacterium ivanovii}
hypothetical protein (SP:P28910) {Escherichia coli}
hypothetical protein (SP:P29202) {Haloarcula marismortui}
hypothetical protein (SP:P31065) {Escherichia coli}
hypothetical protein (SP:P31466) {Escherichia coli}
hypothetical protein (SP:P31473) {Escherichia coli}
hypothetical protein (SP:P31473) {Escherichia coli}
hypothetical protein (SP:P31806) {Escherichia coli}
hypothetical protein (SP:P32639) {Saccharomyces cerevisiae}
hypothetical protein (SP:P32698) {Escherichia coli}
hypothetical protein (SP:P33382) {Listeria monocytogenes}
hypothetical protein (SP:P33382) {Listeria monocytogenes}
hypothetical protein (SP:P34222) {Saccharomyces cerevisiae}
hypothetical protein (SP:P37002) {Escherichia coli}
hypothetical protein (SP:P37487) {Bacillus subtilis}
hypothetical protein (SP:P37528) {Bacillus subtilis}
hypothetical protein (SP:P37545) {Bacillus subtilis}
hypothetical protein (SP:P37555) {Bacillus subtilis}
hypothetical protein (SP:P37869) {Bacillus subtilis}
hypothetical protein (SP:P37872) {Bacillus subtilis}
hypothetical protein (SP:P38423) {Bacillus subtilis}
hypothetical protein (SP:P38619) {Sulfolobus acidocaldarius}
hypothetical protein (SP:P39164) {Escherichia coli}
hypothetical protein (SP:P39364) {Escherichia coli}
hypothetical protein (SP:P39587) {Bacillus subtilis}
hypothetical protein (SP:P42297) {Bacillus subtilis}
hypothetical protein (SP:P42297) {Bacillus subtilis}
hypothetical protein (SP:P42404) {Bacillus subtilis}
hypothetical protein (SP:P45476) {Escherichia coli}
hypothetical protein (SP:P46348) {Bacillus subtilis}
hypothetical protein (SP:P46850) {Escherichia coli}
hypothetical protein (SP:P46851) {Escherichia coli}
hypothetical protein GP:L07942_2 {Escherichia coli}
MJ0240  231618  231094
MJ0241  232062  231628
MJ0243  232563  232318
MJ0248  235142  235651
MJ0251  238728  238288
MJ0252  238849  239487
MJ0255  241359  240607
MJ0257  242764  243696
MJ0258  245039  243840
MJ0259  245717  245112
MJ0261  247082  246423
MJ0263  251686  250727
MJ0270  256421  256188
MJ0271  256902  257441
MJ0272  257452  257649
MJ0273  258107  258412
MJ0274  260378  258819
MJ0275  261121  260516
MJ0280  266375  266758
MJ0281  267291  266761
MJ0282  267341  267787
MJ0284  269902  269174
MJ0286  270849  270499
MJ0287  271160  270870
MJ0288  271755  271222
MJ0289  272805  271801
MJ0290  273753  273121
MJ0292  275409  275137
MJ0296  279767  280360
MJ0297  281155  280406
MJ0298  281290  281739
MJ0301  285101  284220
MJ0303  285971  285558
MJ0305  286594  287778
MJ0306  287997  287818
MJ0308  289084  288386
MJ0310  290609  290268
MJ0311  290981  290652
MJ0312  291845  291228
MJ0314  293767  294369
MJ0315  294826  294455
MJ0316  295458  294964
MJ0317  296374  295733
MJ0319  297675  297902
MJ0320  298001  298645
MJ0321  298675  299040
MJ0325  302095  301172
MJ0327  303625  303927
MJ0328  304755  304318
MJ0329  306607  304760
MJ0330  308266  306620
MJ0331  308670  308266
MJ0332  308995  308678
MJ0333  309670  309410
MJ0334  309816  310112
MJ0335  310179  310919
MJ0336  310932  311288
MJ0337  311299  312084
MJ0338  312100  312402
MJ0339  312374  312694
MJ0340  312697  313398
MJ0341  313411  313770
MJ0342  313918  314286
MJ0343  314270  316807
MJ0344  316820  317359
MJ0345  317314  318264
MJ0346  318277  318579
MJ0347  318593  319045
MJ0348  319620  321995
MJ0349  322367  322053
MJ0350  322681  322418
MJ0351  323154  322705
MJ0352  323901  323185
MJ0353  324142  323891
MJ0354  324296  324123
MJ0355  324661  324374
MJ0356  324957  324697
MJ0357  326407  325943
MJ0358  326796  326413
MJ0359  327449  326808
MJ0360  328174  327770
MJ0361  329502  329182
MJ0362  329659  329847
MJ0364  332163  332495
MJ0365  332503  333030
MJ0366  333033  333308
MJ0368  334581  334886
MJ0369  336040  334934
MJ0371  337418  337639
MJ0374  339873  338884
MJ0375  339920  340681
MJ0377  343243  343752
MJ0378  343921  344886
MJ0379  345500  344889
MJ0380  345657  345974
MJ0381  345977  346936
MJ0382  346955  347683
MJ0383  347677  349518
MJ0384  349546  350259
MJ0385  350252  351304
MJ0386  351648  351307
MJ0390  355149  354760
MJ0395  357787  357314
MJ0398  359111  359923
MJ0400  361593  362411
MJ0401  362717  362520
MJ0402  363046  362729
MJ0404  364804  364355
MJ0405  365385  365002
MJ0408  367518  367880
MJ0409  367946  370054
MJ0410  370074  370865
MJ0414  374603  373419
MJ0415  374712  375197
MJ0416  375222  375791
MJ0417  376510  375800
MJ0418  376627  377388
MJ0419  377369  378430
MJ0420  378394  379533
MJ0421  379640  380719



WO 98/07830 PCT/US97/14900 




MJ0423  381855  382031
MJ0424  382046  382336
MJ0425  382317  382712
MJ0426  383243  382704
MJ0427  383719  383243
MJ0431  387350  387135
MJ0432  388127  387852
MJ0433  388663  388139
MJ0434  389342  388677
MJ0435  389620  389342
MJ0437  391903  391667
MJ0439  394280  393234
MJ0440  394492  395292
MJ0444  398609  397740
MJ0447  401037  400555
MJ0448  401168  401935
MJ0450  403277  403834
MJ0452  404962  404519
MJ0453  405287  404967
MJ0455  406863  406285
MJ0456  406888  407943
MJ0459  410088  410354
MJ0480  422470  423063
MJ0481  423792  424085
MJ0482  423793  423074
MJ0485  427056  428102
MJ0488  432390  432854
MJ0491  434681  435106
MJ0492  435385  435101
MJ0494  436499  436891
MJ0496  438482  438823
MJ0497  439219  438821
MJ0498  439679  439212
MJ0500  442304  441537
MJ0501  442990  442394
MJ0504  445785  446372
MJ0505  446365  447117
MJ0512  453993  453292
MJ0513  454868  454149
MJ0517  459731  459321
MJ0518  460018  459737
MJ0519  460275  460033
MJ0521  461746  461549
MJ0522  462422  461769
MJ0523  463226  462534
MJ0524  463697  463239
MJ0525  463997  463839
MJ0526  464308  464123
MJ0527  465146  464655
MJ0528  465442  465149
MJ0529  466215  465520
MJ0538  474805  474026
MJ0539  476422  474833
MJ0540  476947  476693
MJ0541  477507  476971
MJ0545  483451  482711
MJ0546  483623  483456
MJ0548  485032  484589
MJ0550  487106  486012
MJ0551  487918  487106
MJ0553  489383  488925
MJ0554  490365  489910
MJ0556  492396  491875
MJ0557  493186  492572
MJ0558  493984  493202
MJ0560  495301  494891
MJ0562  496903  496691
MJ0565  502486  502046
MJ0567  504742  504497
MJ0568  504847  505221
MJ0570  506837  506112
MJ0572  509860  510117
MJ0573  510262  510828
MJ0574  510865  511143
MJ0575  511121  511807
MJ0580  515428  515075
MJ0581  515692  515937
MJ0582  515940  516323
MJ0583  516393  516563
MJ0584  516563  517657
MJ0585  517680  518294
MJ0586  518563  519057
MJ0587  519994  519536
MJ0589  521451  521768
MJ0592  525620  526357
MJ0594  526886  527392
MJ0596  528074  528475
MJ0597  528539  529612
MJ0599  530524  531120
MJ0602  533752  532970
MJ0604  535443  535144
MJ0605  535634  535443
MJ0606  536194  535922
MJ0607  536435  536199
MJ0610  540394  539093
MJ0614  545444  545061
MJ0618  547877  547584
MJ0619  549378  547861
MJ0621  551088  550573
MJ0623  552787  553362
MJ0625  553606  554613
MJ0626  554709  555335
MJ0627  555369  555719
MJ0628  555715  556203
MJ0629  556208  556849
MJ0632  558292  559380
MJ0634  562682  564565
MJ0635  564797  565636
MJ0638  568586  567912
MJ0639  568870  568586
MJ0642  571462  572451
MJ0645  574498  574743
MJ0646  574757  575248
MJ0647  575457  575296
MJ0648  575881  575441
MJ0650  577458  579521
MJ0652  580869  580471
MJ0659  585626  586039
MJ0660  586366  586136
MJ0661  587014  586496
MJ0662  587657  587007
MJ0664  589291  590163
MJ0665  590629  590180
MJ0668  594556  594314
MJ0670  596945  595887
MJ0675  601925  600753
MJ0678  605240  604263
MJ0683  611696  610920
MJ0686  615407  613668
MJ0687  616482  615478
MJ0688  616670  617110
MJ0690  617965  617375
MJ0691  618300  617974
MJ0694  620244  621365
MJ0695  621809  621486
MJ0696  622409  621933
MJ0699  625837  624698
MJ0700  625851  626822
MJ0701  626831  628063
MJ0702  628050  629831
MJ0703  629859  630536
MJ0704  631069  632199
MJ0706  633440  634081
MJ0708  634868  634425
MJ0711  643995  644960
MJ0712  645967  644963
MJ0714  648530  648880
MJ0716  650013  650270
MJ0717  650815  650459
MJ0724  657809  657189
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MJ0730 


663605 


663048 


MJ0731 


664213 


663620 


MJ0733 


665883 


665521 


MJ0737 


667834 


667652 


MJ0738 


668149 


667877 


MJ0739 


668627 


668175 


MJ0742 


669819 


669496 


MJ0745 


672208 


671675 


MJ0747 


673416 


672961 


MJ0749 


675903 


675151 


MJ0750 


676710 


675997 


MJ0751 


677628 


676795 


MJ0752 


677942 


677715 


MJ0753 


678766 


678146 


MJ0754 


679347 


678775 


MJ0755 


680644 


679619 


MJ0756 


681296 


680889 


MJ0757 


682155 


681424 


MJ0758 


682653 


682213 


MJ0759 


683029 


682700 


MJ0760 


683871 


683047 


MJ0761 


684833 


684072 


MJ0763 


686251 


685889 


MJ0764 


686611 


686264 


MJ0766 


688821 


688729 


MJ0767 


689531 


689100 


MJ0768 


689589 


690335 


MJ0769 


690987 


690481 


MJ0770 


691651 


690983 


MJ0772 


692429 


693487


MJ0773


694540


694016


MJ0774


695228


696454


MJ0775


696438


697379


MJ0776


697375


698523


MJ0777


698474


699046


MJ0778


699097


699603


MJ0779


700509


699613


MJ0780


701537


700533


MJ0783


706171


706737


MJ0786


710078


710620


MJ0788


712303


712539


MJ0789


712625


712972


MJ0790


713001


713696


MJ0792


715511


715777


MJ0793


716398


716931


MJ0794


716992


717405


MJ0795


717488


718999


MJ0797


720647


721759


MJ0798


721779


722780


MJ0799


722786


723667


MJ0801


725037


726173


MJ0802


726398


726961


MJ0803


726984


727499


MJ0804


727530


728387


MJ0805


728332


728994


MJ0807


730149


730670


MJ0808


730806


731804


MJ0809


733025


733525


MJ0810


733584


734255


MJ0811


735675


734359


MJ0815


739584 


738697 


MJ0816 


740542 


739652 


MJ0817 


741119 


740502 


MJ0818 


741733 


741125 


MJ0819 


742225 


741899 


MJ0820 


742295 


742191 


MJ0821 


742765 


742598 


MJ0823 


744830 


745600 


MJ0826 


747462 


747875 


MJ0830 


750568 


750101 


MJ0831


750950 


752245 


MJ0833 


758976 


758239 


MJ0834 


759796 


759083 


MJ0835 


760901 


759822 


MJ0836 


762786 


762430 


MJ0837 


762860 


763606 


MJ0838 


764466 


764816 


MJ0839 


765906 


764857 


MJ0840 


765992 


766972 


MJ0841 


768225 


766981 


MJ0856 


780538 


779996 


MJ0857 


781920 


781099 


MJ0858 


782318 


781980 


MJ0859 


782837 


782355 


MJ0865 


788311 


789585 


MJ0871 


795055 


795975 


MJ0872 


797236 


796022 


MJ0874 


798213 


798491 


MJ0875 


798611


800854 


MJ0878 


803147 


804388


MJ0880


805402 


806325 


MJ0883 


808397 


809404 


MJ0887 


818880 


818209 


MJ0889 


819606 


821000 


MJ0890 


821429 


821019 


MJ0894 


824064 


824486 


MJ0895 


824467 


825492 


MJ0896 


825552 


825953 


MJ0897 


825946 


826362 


MJ0898 


826495 


826932 


MJ0899 


826954 


827643 


MJ0900 


827668 


829308 


MJ0901 


829430 


830998 


MJ0902 


831028 


831729 


MJ0903 


831942 


833855 


MJ0904 


834299 


834547 


MJ0905 


834622 


834954 


MJ0906 


834959 


836056 


MJ0907 


836917 


836072 


MJ0909 


840933 


841220 


MJ0910 


841954 


841433 


MJ0912 


843688 


844416 


MJ0914 


845908 


845783 


MJ0915 


847507 


846707 


MJ0916 


847875 


847609 


MJ0917 


847950 


849671 


MJ0919 


850996 


850550 


MJ0921 


852470 


851571 


MJ0923 


853368 


854258 


MJ0925 


855529 


855212


MJ0926


856378 


856638 


MJ0933 


862692 


863390 


MJ0935 


864824 


865447 


MJ0936 


865545 


866042 


MJ0938 


868207 


867473 


MJ0939 


868278 


869102 


MJ0943 


875111 


873870 


MJ0944 


875300 


875659 


MJ0945 


876358 


875687 


MJ0948 


881231 


880668 


MJ0949 


881637 


881269 


MJ0950 


882370 


881684 


MJ0951 


883634 


882570 


MJ0953 


884488 


884787 


MJ0954 


886106 


884802 


MJ0956 


887437 


888216 


MJ0957 


888219 


889268 


MJ0958 


889276 


890553 


MJ0962 


894937 


895320 


MJ0966 


899875 


901197 


MJ0967 


901940 


901326 


MJ0968 


901996 


902814 


MJ0969 


903935 


903126 


MJ0970 


904627 


904199 


MJ0971 


904756 


905844 


MJ0972 


905808 


906488 


MJ0973 


907728 


906496 


MJ0974 


908172 


907741 


MJ0975 


908365 


908162 


MJ0976 


908463 


909560


MJ0977


909594 


911000 


MJ0978 


911359 


911688


MJ0979 


912309 


911719 


MJ0981 


914246 


913641 


MJ0986 


917606 


917373 


MJ0987 


917909 


918247 


MJ0988 


918361 


919347 


MJ0991 


920189 


920608 


MJ0992 


920924 


921142


MJ0995 


924316 


923636 


MJ0997 


925109 


925719 


MJ0998 


926425 


926012 


MJ1002 


930965 


931891 


MJ1004 


933349 


933990 


MJ1005 


933994 


934386 


MJ1006 


934412 


935437 


MJ1010


941079 


939958 


MJ1011


941860 


941471 


MJ1016 


946060 


946941 


MJ1017 


946934 


947542 


MJ1020 


950418 


951194 


MJ1021 


951732 


951244 


MJ1022 


953674 


951968 


MJ1024 


954536 


955744 


MJ1025 


956917 


955751 


MJ1028 


959569 


961611 


MJ1030 


962492 


962932 


MJ1032 


963985 


965082 


MJ1034 


966050 


966310 


MJ1036 


967587 


968276


MJ1049


986885 


987367 


MJ1050 


987438 


987968 


MJ1052 


989793 


989503 


MJ1053 


990349 


989861 


MJ1060 


1000457 


1002067 


MJ1067 


1008238 


1008681 


MJ1069 


1010805 


1009630 


MJ1070 


1011399 


1010929 


MJ1071 


1012337 


1011399


MJ1072 


1012709 


1012362 


MJ1073 


1013688 


1012879 


MJ1074 


1014135 


1013800 


MJ1076 


1016646 


1015636 


MJ1077 


1018245 


1016683 


MJ1078 


1019039 


1018338 


MJ1079 


1020506 


1019316 


MJ1080 


1021091 


1020687 


MJ1082 


1021657 


1022016 


MJ1083 


1022089 


1022667 


MJ1085 


1023633 


1025159 


MJ1086 


1025159 


1026178 


MJ1092 


1030102 


1030743 


MJ1094 


1033051 


1031897 


MJ1095 


1034350 


1033088 


MJ1098 


1039265 


1038627 


MJ1099 


1040323 


1039619 


MJ1103 


1043990 


1043727 


MJ1106 


1046606 


1046052 


MJ1107 


1047073 


1046627 


MJ1110


1052574


1051117


MJ1111


1053691 


1052540 


MJ1112 


1053818 


1053645 


MJ1114


1055795 


1055220 


MJ1117


1058450 


1059037 


MJ1118 


1059065 


1059331 


MJ1120


1060339 


1061175 


MJ1121


1061532 


1061251 


MJ1122


1061729 


1061508 


MJ1123


1061809 


1062423 


MJ1125


1066578 


1066399


MJ1126


1067325 


1068140 


MJ1127


1068204 


1069043 


MJ1128


1069964 


1069050 


MJ1132


1073401 


1073048 


MJ1134


1075567 


1074881 


MJ1137


1078625 


1078035 


MJ1138


1078694 


1079215 


MJ1139 


1080031 


1079336 


MJ1140


1080732 


1080049 


MJ1141


1080810 


1081406 


MJ1143


1082498 


1083604 


MJ1144


1084575 


1083607 


MJ1145


1085112 


1084918 


MJ1147 


1086431 


1087786 


MJ1150 


1088688 


1089230 


MJ1151 


1089352 


1089681 


MJ1152


1089693 


1089902 


MJ1153


1089902 


1090087 


MJ1154


1091598 


1090246 


MJ1157


1097614


1098636


MJ1158


1097631


1097245 


MJ1159 


1098676 


1100610 


MJ1161


1102129 


1102629


MJ1163 


1104052


1104747


MJ1164 


1106045 


1105095 


MJ1172


1111539 


1111781 


MJ1173 


1111785 


1112066 


MJ1177 


1117451 


1118467


MJ1179


1118839


1119285


MJ1180


1119545 


1119979 


MJ1181 


1120081 


1120677 


MJ1182


1121087 


1122184 


MJ1183 


1122200


1122670


MJ1184


1122741 


1123160 


MJ1185 


1125032 


1123167 


MJ1186 


1125194 


1126231 


MJ1188 


1127047 


1126238 


MJ1189 


1128908 


1128060 


MJ1198


1142323


1144605 


MJ1199 


1145059 


1144631 


MJ1205


1148679 


1148371 


MJ1206


1149937 


1148675 


MJ1207


1150577 


1151254 


MJ1209


1154047


1152613 


MJ1210 


1154918 


1154148 


MJ1211 


1155290 


1154943


MJ1213 


1156520 


1156191 


MJ1215 


1159884 


1159639


MJ1216 


1160233 


1159871


MJ1217 


1160540 


1160247


MJ1219


1162177 


1161875 


MJ1221 


1164080 


1164958 


MJ1222 


1165703 


1164984 


MJ1223 


1165956 


1165681 


MJ1224 


1167016 


1166600 


MJ1230 


1173450 


1173235 


MJ1232 


1176334 


1175447 


MJ1233 


1176475 


1177311


MJ1234 


1178669 


1177947


MJ1239 


1184644 


1185318 


MJ1240 


1185617 


1185327 


MJ1241 


1185877 


1185644


MJ1243 


1187992 


1187624


MJ1244


1188410 


1188087 


MJ1245 


1188760 


1188425 


MJ1248 


1191184 


1190723 


MJ1249 


1191367 


1192449 


MJ1250 


1192973


1193731 


MJ1254 


1197164 


1197400 


MJ1255 


1197430


1198611 


MJ1256 


1198911 


1199543 


MJ1257 


1199543 


1200589 


MJ1262 


1204364 


1205530 


MJ1272 


1216145 


1216633 


MJ1278 


1223720 


1223184 


MJ1279 


1224266 


1223724 


MJ1280


1224460 


1224930 


MJ1281 


1224854 


1227994 


MJ1282 


1228714 


1229769 


MJ1283 


1231676 


1231017


MJ1284


1232029 


1231667 


MJ1285 


1232580 


1232029 


MJ1286


1234269 


1232587 


MJ1287 


1235086 


1234319 


MJ1288 


1235901 


1235155 


MJ1289 


1236778 


1236284 


MJ1290 


1237713 


1236778 


MJ1291


1238448 


1237729 


MJ1292 


1238662 


1241124 


MJ1293 


1241174 


1241866 


MJ1295 


1243251 


1242847 


MJ1301 


1250120 


1248921 


MJ1302 


1250541 


1250149 


MJ1304 


1252617 


1252162 


MJ1305 


1253036 


1252596 


MJ1306 


1253300 


1253052 


MJ1307 


1254110 


1253325 


MJ1308 


1254426 


1254115


MJ1309 


1255877 


1254459 


MJ1310 


1256325 


1255942 


MJ1311 


1256457 


1257287 


MJ1312 


1257321 


1258283 


MJ1313


1258388 


1259596 


MJ1315 


1260519 


1261589 


MJ1316 


1261606 


1261833 


MJ1317 


1263015 


1261822 


MJ1318 


1264868 


1263063 


MJ1320 


1268194 


1267802 


MJ1321 


1270356 


1268218 


MJ1322 


1273392 


1270378


MJ1323


1274489 


1273392 


MJ1325 


1275428 


1275694 


MJ1327 


1277081 


1277815 


MJ1330 


1280424 


1280792 


MJ1331 


1281220 


1280801 


MJ1333 


1282515 


1282766 


MJ1336 


1284800 


1285282 


MJ1337 


1285743 


1286216 


MJ1339 


1287389 


1287850 


MJ1340 


1287925 


1288266 


MJ1341 


1289221 


1288286 


MJ1342 


1289457 


1289798 


MJ1345


1291918 


1292841 


MJ1348 


1295149 


1296126 


MJ1350 


1298227 


1297454 


MJ1354 


1304338 


1304772 


MJ1355 


1304858 


1306531 


MJ1356 


1306729 


1307295 


MJ1358 


1309040 


1308648 


MJ1359 


1309889 


1309164 


MJ1360 


1310249 


1309953 


MJ1361


1310355 


1311230 


MJ1364 


1313354 


1314619 


MJ1369 


1318564 


1319028 


MJ1370 


1319061 


1320044 


MJ1371 


1320053 


1320775 


MJ1373 


1321601 


1322086 


MJ1374 


1322262 


1322954 


MJ1379 


1328524 


1328823 


MJ1380 


1328819 


1329052


MJ1382


1331473


1331036


MJ1383


1332364


1331597


MJ1384


1333177


1332596


MJ1385


1333741


1333205


MJ1386


1333877


1334008


MJ1387


1335433


1334297


MJ1389


1337813


1337412


MJ1393


1341979


1343802


MJ1394


1343895


1346852


MJ1395


1347176


1347571


MJ1396


1347707


1356388


MJ1397


1356457


1357905


MJ1398


1358183


1359355


MJ1399


1359929


1359339


MJ1400


1360142


1359942


MJ1401


1360259


1362682


MJ1402


1364357


1363320


MJ1403


1365794


1364673


MJ1404


1366111


1367364


MJ1405


1367427


1367639


MJ1407


1368408


1368794


MJ1409


1370733


1369939


MJ1410


1371310


1370834


MJ1412


1373210


1374703


MJ1414


1375807


1375094


MJ1416


1378350


1376995


MJ1419


1382016


1381714


MJ1423


1394263


1393208


MJ1424


1394481


1395002


MJ1427


1396680


1397633


MJ1428


1397643 


1399343 


MJ1429 


1399343 


1400842 


MJ1431 


1401322 


1402398 


MJ1433 


1402914 


1403654 


MJ1435 


1404402 


1404614 


MJ1436 


1404758 


1405048 


MJ1437 


1405055 


1405738 


MJ1440


1407288 


1408133 


MJ1442 


1412130 


1412735 


MJ1443 


1412784 


1413104 


MJ1445 


1414331 


1414858 


MJ1447 


1415840 


1416982 


MJ1448 


1416982 


1418571 


MJ1449 


1418577 


1419686 


MJ1450 


1419699 


1420811 


MJ1451 


1420869 


1422320 


MJ1452 


1422616 


1423392 


MJ1453 


1423398 


1423973 


MJ1455 


1425643 


1424729 


MJ1457 


1427021 


1427422 


MJ1458 


1427487 


1428140 


MJ1460 


1430419 


1429943 


MJ1461 


1431156 


1430560 


MJ1462 


1431506 


1431258 


MJ1463 


1432201 


1431530 


MJ1466


1436397 


1435756 


MJ1467 


1436562 


1437008 


MJ1468 


1437029 


1440055 


MJ1469 


1440055 


1440279 


MJ1470 


1440747 


1442618


MJ1471


1442618 


1443151 


MJ1472 


1443165 


1444796 


MJ1475 


1446447 


1446821 


MJ1477 


1447530 


1448537 


MJ1478 


1449448 


1448540 


MJ1480 


1451452 


1452720 


MJ1481 


1452735 


1453373 


MJ1483 


1454337 


1454783 


MJ1484 


1454768 


1455217 


MJ1487 


1459016 


1460293 


MJ1488 


1460315 


1461493 


MJ1491 


1465684 


1466055 


MJ1492 


1466067 


1466534 


MJ1493 


1466552 


1467235 


MJ1495 


1468532 


1469377 


MJ1496 


1469370 


1469711 


MJ1497 


1469711 


1470748 


MJ1499 


1472128 


1471649 


MJ1500 


1472920 


1472363 


MJ1501 


1473615 


1472947 


MJ1503 


1474982 


1474587 


MJ1506 


1479963 


1478767 


MJ1507 


1480030 


1481214 


MJ1509 


1482024 


1482482 


MJ1510 


1483084 


1482506 


MJ1511


1483234 


1483572 


MJ1513 


1489601 


1488606 


MJ1514 


1489692 


1490078 


MJ1515 


1490084 


1491148 


MJ1516 


1491173 


1491466 



MJ1517 


1492030 


1492863 


MJ1518


1492917 


1493975 


MJ1519 


1494094 


1497618 


MJ1520 


1498588 


1497656 


MJ1521


1498905 


1500170 


MJ1524 


1501404 


1501727 


MJ1525 


1501702 


1504500


MJ1527 


1505607 


1505281 


MJ1535 


1512870 


1513766 


MJ1537 


1515742


1514714 


MJ1539 


1516728 


1517042


MJ1540 


1517209 


1517466 


MJ1542 


1521169 


1518746 


MJ1544 


1523759 


1522470 


MJ1545 


1523900 


1524592 


MJ1547 


1525820 


1526005 


MJ1548 


1526062 


1526427 


MJ1550 


1527849 


1528031 


MJ1551 


1528046 


1528216 


MJ1553 


1528749 


1529240 


MJ1554 


1529326 


1531191 


MJ1556 


1532701 


1533636 


MJ1557 


1533644 


1534390 


MJ1558 


1534666 


1534397 


MJ1559 


1534699 


1535262 


MJ1561 


1538168 


1536510 


MJ1562 


1539331 


1538168 


MJ1563 


1539812 


1539345 


MJ1564 


1540186 


1540695 


MJ1565 


1540699 


1542237


MJ1566


1543572 


1542232 


MJ1567 


1544072


1543557 


MJ1568 


1544632 


1544078


MJ1570 


1545637 


1545981 


MJ1571


1546111 


1546986


MJ1573 


1548452 


1548270 


MJ1576 


1551559 


1552164 


MJ1577 


1552197 


1553990 


MJ1579 


1555146 


1554937 


MJ1580 


1555498 


1555127 


MJ1583


1557431 


1557808 


MJ1584 


1558268 


1557816 


MJ1585 


1559172 


1558255 


MJ1587 


1560732 


1561265 


MJ1588 


1561285 


1561620 


MJ1589 


1561657 


1562379


MJ1590 


1562770 


1563084 


MJ1595 


1567357 


1566332 


MJ1598 


1572075 


1571026 


MJ1599 


1572924 


1572094 


MJ1600 


1573002 


1573532 


MJ1601


1573539 


1574018 


MJ1604


1578693 


1577308 


MJ1608 


1582917 


1583126 


MJ1609 


1583168 


1584289 


MJ1613 


1589822 


1589058 


MJ1614 


1590582 


1589830 


MJ1615 


1591350 


1590586 


MJ1617 


1593103 


1593381 


MJ1618 


1593786 


1593397


MJ1620


1594531 


1596084 


MJ1621 


1596297 


1596127 


MJ1622 


1597169 


1597719 


MJ1623 


1597939 


1599474 


MJ1624 


1599991 


1599602 


MJ1626 


1602381 


1600087 


MJ1627 


1604683 


1604231 


MJ1628 


1606127 


1604784 


MJ1629 


1607293 


1606418 


MJ1630


1610737 


1607330 


MJ1631 


1611184


1612740 


MJ1632 


1612697 


1613446 


MJ1633 


1614897 


1613467 


MJ1634 


1615733 


1615011 


MJ1635 


1615933 


1617174 


MJ1637


1618268 


1619686 


MJ1638 


1620457 


1619678 


MJ1639


1620605 


1621036 


MJ1640 


1621671 


1621057 


MJ1641


1622664 


1621804 


MJ1642 


1623032 


1623514 


MJ1644 


1627146 


1627667 


MJ1646 


1628442 


1629074 


MJ1650 


1632586 


1631435 


MJ1651 


1633407 


1632631 


MJ1653 


1635797 


1636951 


MJ1654 


1637097 


1637693 


MJ1657 


1639687 


1640427 


MJ1658 


1640511 


1640783 


MJ1659 


1640800 


1641870


MJ1660


1641857 


1643503 


MJ1664 


1646502 


1647179 


MJ1665 


1648555 


1647182 


MJ1666 


1650080 


1648686 


MJ1667 


1651336 


1650083 


MJ1668 


1652321 


1651194 


MJ1669 


1653119 


1652376 


MJ1670 


1653547 


1653149 


MJ1671


1653684 


1653550 


MJ1672 


1656206 


1653807 


MJ1673 


1656630 


1656244 


MJ1674 


1658539 


1656638 


MJ1676 


1659621 


1660334 


MJ1678 


1660939 


1662126 


MJ1679 


1662142 


1662432 


MJ1680 


1662411 


1662866 


MJ1681 


1663887 


1662862 


MJECS01


1268 


432 


MJECS02 


4814 


1272 


MJECS03 


5192 


4851 


MJECS04 


5884 


5459 


MJECS05 


6365 


6814 


MJECS06 


7443 


7009 


MJECS07 


8765 


7428 


MJECS08 


11950 


8738 


MJECS09 


12641 


11925 


MJECS10


14062 


13181 


MJECS11


14404 


15030 


MJECS12 


16547 


15411 


MJECL01


275 


1048


MJECL02


1474 


1085 


MJECL03 


1700 


1377 


MJECL04 


1865 


3250 


MJECL05 


3235 


3450 


MJECL06 


4170 


3787 


MJECL07 


5844 


4561 


MJECL08 


7415 


5832 


MJECL09 


7780 


8103 


MJECL10


8107 


8784 


MJECL11


8788 


9159 


MJECL12 


9150 


9887 


MJECL13 


10678 


12483 


MJECL14 


14468 


15427 


MJECL15 


15420 


16541 


MJECL16 


16599 


16811 


MJECL18 


20873 


21505 


MJECL19 


21456 


22019 


MJECL20 


22829 


23290 


MJECL21


24596 


23298 


MJECL22 


25120 


24854 


MJECL23 


27628 


25136 


MJECL25 


28835 


29167 


MJECL26 


30215 


29178 


MJECL27 


31077 


30571 


MJECL28 


35352 


31534 


MJECL30 


37621 


37151 


MJECL31 


37811 


37599 


MJECL32 


40153 


38828 


MJECL33 


41381 


40125 


MJECL34 


43121 


42231


MJECL35


45007 


43115 


MJECL36 


45921 


45394 


MJECL37 


46065 


46865 


MJECL38 


47997 


47197 


MJECL39 


49387 


48329 


MJECL41 


53908 


52613 


MJECL43 


57371 


56187 


MJECL44 


58339 


57341


Genes of M. jannaschii that contain inteins.

Gene No.   Putative identification                                     No. of inteins
MJ0043     Hypothetical protein (Bacillus subtilis)                    1
MJ0262     Putative translation initiation factor, FUN12/IF-2 family   1
MJ0542     Phosphoenolpyruvate synthase                                1
MJ0682     Hypothetical protein (Escherichia coli)                     1
MJ0782     Transcription initiation factor IIB                         1
MJ0832     Anaerobic ribonucleoside-triphosphate reductase             2
MJ0885     DNA-dependent DNA polymerase, family B                      2
MJ1042     DNA-dependent RNA polymerase, subunit A'                    1
           DNA-dependent RNA polymerase, subunit A"                    1
MJ1054     UDP-glucose dehydrogenase                                   1
MJ1124     Hypothetical protein (Saccharomyces cerevisiae)             1
MJ1420     Glutamine-fructose-6-phosphate transaminase                 1
MJ1422     Replication factor C, 37-kD subunit                         3
MJ1512     Reverse gyrase                                              1


The 1,664,976 base pair M. jannaschii circular chromosome (SEQ ID NO:1) has
the following sequence:



GGATTATTATGCTACTGGTTTTAAAATAATTGACTTATCTAAACTAAAAGGAGGAATTAA 
GAGAGAGTTTAACGCATCTAATAGAGAATTATATAAAAAGGATTTGATTATTTATGAAAA 
GGATTTAAAATAAATAAATTCGCTTATCTTCTCTTCAATTTTTATTACTCATAAAAATTA 
ATTTATGTATTTATTTATATATTMTGTTA7VATAAAGTAAGTAGGGGGAAATATGTCAAA 
GTCTGGGAATAAAAACCAAAATTGCCCAAAATGTAATAACAGCCCATGGATACAAAGAGC 
AAATAATTTTATTGCTCAAAATCAAAATGTTCAAACAGGTACTAAGGAATATTATCAAGT 
TGAAGCAGTAAAGTACTTATTAAATAATGGACATTGTGGGATAGATTGTAGGGCAAAAAT 
TAGCGATATTATA?iAGGGAATAAATTATCCCAAAAATAGGGAAGCTTTCC7^CATGAAGT 
GTTGATACCACTAAAACAGTATGGCATCATAGCAACATTGGTTTATCCAGGACGTAAAGG 
AGGCGTATTTATCCCATGTAATAATGATGAAATAAAAAAAGTGGCAAAACAAGTGTTTAA 
GAGGATAGAAAGTGAATTAGAAAATTTAGAAGGTTCTGCGACAGGAGTTCAAAATATAAA 
AAATTTAGCAAATTCTCTAAAAACGACTGTTCACAATCTTAAGAACACTATTTAAATAAA 
TGCATCAAGAGTAATTATGTTTTTGTTTTTTACATTATCAAATTTTCCATCTGTTTTTAA 
AAGTTCTTTTTTTATCCTCTCCTCTGCAACTCTGCAATAGTATTCATCAATCTCAAAGCC 
AATATAATCAATCCCTAACCTAATACATGCTATTGCTGTGCTTCCAATTCCCATAAATGG 
GTCTAAAACAAGATTTGTCTTTTTAACACCATGCAATTTAATACACATCTCCGGAAGTTT 
TGGAGGAAATGTTGCAGGATGAGGTCTTTCTTTTTCTTTTGATTGGATTGTTTCATAAGG 
GATAAACCACGTATTTCCCCTATCTCTTAAATCTCCTTTTCTGTTAAATCTCTTTATATT 
GCTTTTATCCTGATAAGG7VACACCAATTGCTAATTTGTCTAACTTAACGTTCCCATTTTT 
TGTGAAGTGGAAAATATATTCATGCATTATACTTAAAAATCTATCACTGTTTATTGGCTT 
GTAuATGTCCAACAGCAATATCTCCAATAATATTTGGGTAATTTCCAACATCTTCTTTTTG 
TATTGCAATTGATTTTACCCAATGTATAGTATTTTGTAATTTAAAATGTTTTCTTATAAC 
ATTAGCAACATCTVAAGGCAATCCACGGGTCTTTTGCAGTATAGCCAACATTTATAAAAAA 
TGAGCCGTCATCTTTTAATACTCTCTTTATTTCTTTGACAACTTCTTCAATCCAATTTAA 
ATAATCTTCTCTACTTAAATTATCAGAGTATTTGTTGTATTTTATGCCAATATTATAGGG 
TGGAGACGTAACAACAACATCAACTGTCTTATCTTTTAACTGTTTCATTCCCTCTAAACA 
ATCCATACAGTAGATTTTATTTATCTCCATTTTTAATCCCCATCATTATTTATTCTATCA 
TCAATTCTGCAAGCTTCTCTACTTCTTTAATTCCCCTATCAAAATCATTTAAGTTTAAAT 
ATTCTTCTTTAGAATGGGCAAGCTCTAATTTGCCAACACCATAAATAATAGTATCTGCCT 
TTAAAAATTTGTTGAAGTAATATGCTTCGCAAGTAGCATTAAAAAATGATATTTTAAAGT 
GCTTAGACAACTTATTTATTAACTCTTTATTTTCAAGCATGTAGAAATTAGCATAATGTC 
TTTCAGGATTTAATGAGCTTTTTATATGCTTTGAATAATTTTTTTGAGATAAAAAGTCGT 
CTATCTTTTTTATTATATCTTTTTCAACACTTCTAACATCAAATAAGACATAAGCATAAT 
CTGGAATGATATTGCTTTGAATTCCTCCTTTTATTATGGTTGGAGTTATTGAAGAACTGT 
AGATTTTATCAACCTTAATCTTTTCCAAAGGAAGATTTTTTAAATCT7VAAATAACTCTGC 
TTAAGATTTCTATTGGATTTAGGCCTTGAGATGAGGCATGCCTCGCCTCCCCAAAACTTT 
CAACAATATACTCAAATCTTCCTTTATGTCCAATACAAACATTTAAGTCAGTAGGCTCTC 
CAACTATGCATTTAATACCTCTTTGAATTTTATTTTTATTTCTTAAATATTGGCAAAAAT 
TGTAAATACCATTTGATTCTGTTTCTTCATCAGGAGATATAACTAATAGAGAGTTATTGC 
TATTTAAAAAAGCATGAATCATTAAAACCACATTCCCTTTAGCATCTATAACTCCAGTCC 
CATAAAAATTGTTATCATCTTTTTTAAAATTTGATTGAATCTTTACAGTGTCTATATGTG 
AATTTAATATCAAATCAAAGTTTTCTTTTTCTTTATATGCTACAAAGCATCCTTCAATGA 
TAGTATTTTTTATTCCTAAGTTATTGAAAAGATTAGATAAATATTTAAATGCCTTTTTAA 
CACCAATTCTATTATCCGTCCTAATTTTCACCAAATCCTCTAAGATTTTTAAATAATCCA 
TAATTATCATCTCATAAATTCTACTTTTTCTCCAATAATTTCATTTAAATCAATATCACT 
ACACTTAAATTCAAGCATTGCTGTTGAGTAATTTTTACATTTGTAGGTTTTCCATGGCTT 
TAATCTTACAGCTTCGACAACCCTATTTTTATCAATAAAAATTATATCAATAGGATAAAG 
CATAAAGAATGTATGCATAGCTATCTTTCTTCTTTTGTATAGG7Uyy\GCATAGCTTTATC 
TCCAATATCTCTAAGCATTAAACCAAAAGCTCTTTTAATAAAATTATCTGCCAATACAAC 
TTCAAATTCTT^AATTTCCAACTTTAACTTTTTTAATTTTCTTATTTTGCATTTTTTTCAC 
TTTCTTTTTTGCTGTATGGGACAGGGATGTAATAAACTGAAGGTTTGGCTCCCATTGGTT 
GTGGATAAAGCTCTAATAACTCATAAACCTTTCTTGGAACATTTGTATTAACTTCAATAC 
CTAATTCTTTTAATTTACTAACTGTTAAAGGGTAATCATGTGTCCATGTTCCTGAAGTTA 
GTTTTTTTGCGATTTCTTTAGCTTTTTCATCTCCATATTTATCTTTCAACAACTCATAAA 
CATy^TTCTTCCATCTGTTTAATAGCTTTTTTAGATATATCAACCAATATTAATGTCTCAT 
CACTTACTTTTTCTCCCTTCCTATAGTATGCCTCTTU^GATaGATGCAGCAGGATACTGCC 
CAATCTGTGGATCTACTGGCCCCATTACAGCGTTTTTATCCATAATTATTTCATCTGCAG 
CTAAGGCAATTAAACTTCCTCCACTCATCGCATAATGTGGAATTATAACTGTTGTTTTTG 
CCTTATGTTCCTTTAAAGCTAAGGCTATCTGCTCACTCGCTAAAGCTAAACCTCCAGGAG 
TATGAATGATTAAATCAATAGGCATATCTTCTGGTGTTAATCTAATAGCCCTcAAAATCT 
CTTCACTATCTTCAATAGTGATAAATTTATATATTGGTATCCCTAAGAATGTTAATGCTT 
CTTGTCTATGTATCATAGCTATAACTCTTGTTCCCCTCTGTCTTTCAATCTCCCTTATAC 
ATCTCAACCTTTTCATTATCCTATATCTCATCATCATCTCTGGATAAATAAATAATAGAA
ATATGAATAAGAAGAAAAACATATCCATCGATGTCATTTCATCCCCCATTATTTTTGTAA

GGTAAATTATTAATATCACTTCATGAATATAAATATAGTTGCCTTATTAATAGGACTTTC 

GCAGGAAAAATATTTTTATTGAATATTGACACTCTTTGAGTGTCTAAGCTCCAAATTTAT 

ACATAAACTGCGAAAGTCCTATTTATCATCACTTAAACTGGTGATTGACTATGAGTAAAA 

TTGGATTTAATCCAATAAAAATAAAATCTTTTTCAAAGATTAAAACTTACGATGATACAT 

TACCATCATTAAAGTACGTTGTATTAGAGCCTGCGGGATTCCCAATCAGGGTTAGTAGCG 

AGAACGTTAAAGTTTCTACTGATGATCCTATATTATTCAACATCTATGCGAGAGACCAGT 

GGATTGGCGAGATTGTTAAAGAGGGAGATTACTTATTTGATAACTCAATCCTTCCAGATT 

ATGCTTTC7VAGGTTATTTCAACTTATCCAAAAGAGGGAGGAATGATTACAAGCGAGACTG 

TCTTTAAATTACAAACTCCTAAAAAAGTTCTTAGAACACAGTTTAAAAAAGCTAAGTTCA 

GCGAGATTATTGGGCAGGAAGAGGCAAAGAAGAAGTGTAGAATTATTATGAAGTATTTAG 

AGAATCCAAAGCTCTTTGGAGAATGGGCTCCAAAGAATGTGTTGTTCTATGGTCCTCCAG 

GAACTGGATUVGACATTGATGGCAAGAGCTTTAGCTACAGAGACAAACTCCTCATTTATAT 

TGGTGAAAGCTCCAGAGCTTATTGGAGAGCATGTTGGAGATGCTTCTAAAATGATTAGGG 

AGTTGTATCAAAGAGCATCTGAGAGTGCTCCATGTATAGTGTTTATTGATGAATTGGACG 

CTATAGGATTAAGTAGGGAATATCAATCATTGAGAGGAGATGTTTCTGAAGTAGTTAATG 

CACTATTAACTGAATTAGATGGAATTAAAGAAAATGAGGGAGTTGTAACTATAGCAGCGA 

CAAACAACCCAGCGATGTTAGACCCAGCAATTAGAAGTAGGTTTGAGGAAGAGATTGAGT 

TTAAGTTACCAAATGATGAGGAGAGATTGAAGATTATGGAGCTTTATGCTAAAAAAATGC 

CACTTCCAGTTAAAGCTAACTTGAAGGAGTTTGTAGAGAAAACAAAAGGATTTAGCGGTA 

GAGATATCAAAGAGAAATTCCTAAAGCCAGCGTTACATAGAGCAATATTGGAAGACAGGG 

ATTACGTTAGCAAGGAAGATTTAGAATGGGCGTTGAAGAAAATATTAGGCAATAGAAGAG 

AAGCTCCACAACACCTCTATCTCTAATCCTCATAATCAAAGTAATTATCATAATACTCTA 

TTAAATAATCTCCAACAATCCATAATTCTTTTTTATGCTTTCTATATAAATTTAT/^GCT 

TTTTTATTGCTTCTTTATTTTCTCTTCTAAATATTTCGTCTAATATTATGGTTAATGCCT 

CAATAATATCAGAATTATTAAAATCCAAATCTGCCCTTATCATCTCATCAATAACCTTTA 

TCAACTCCTCATCATCAGCATTTTTAACAAACTGATTTCTTAAAAATGATTTAAATGTAT 

AAATATTTTTTCTCTCTAATGGAATTAGTTTTATTAAGCCAATGGCTTTATTTAATAAAC 

TGTCTGTATTTAAAAACTCATCCCATAGATAATATTTAACAACCAAATTTTTTAAAATTT 

TTAAAATCTCTTCTTTAGGTTTATCTTCTATTTCTTTAAATATTTCTTCTGCATCCACAT 

AGTTGTTGTTTTCATAAGCTTCAATTAAAGCATAGGCATGTTTGCAGTTGTATTTGTATT 

GGCAGGTGCATAATCCAAAATAGTTATTATCTAAATCAACTTTAACTTTATAAGTATCTG 

AGCCAACAACCTCCCCAAATAAAAAATTTTTGTATTTTATGCAGTATTTGACTAAATTGT 

TTCTATAATATAGCTTTCCTCTTTCTATTATTTTTGGGTCGTAGTTCATGGTTATCACAA 

AAATTATTTAAATTTTTTATAAATCATTTCAAAAAATATCGGCAGAATATAAAAAACTAC 

AGTAAATCCAGCAATAAAGCCAGTTATAAACCTCAACTCATTAAAACTTTCTCTCAATCC 

AATTAGTTGAGTGGTTCCATCAACTGCCATAGGAATTAATGCAATTATTAAATACCATTT 

ATTAGGGATTTTAAAATCATCTAATTTCTTAATAAATGGATAAATAATCATCCCTACTAA 

AACCCCTGTATAAATCCCAAAACATCTTGCACACACGGCCATTTTATGTCCAAAGATAAA 

AAAGCTTCTTTGTGGCATTTGATGGCATATAAGGGAATAAACAGCGTATAAACATATTGA 

AATAAACTTCCAAAAATTTGATGTTTCTCCCAAATATGCAAAATAAGGTGCTAAAAAAAT 

ACTCAAATAAAAAATAAGAAAAGAAATAAGGACTATTAAATAATATTTTTTCATAACCCC 

ACTTATCTATTCTTTATAACAACATATATAACTCCACCAACAGCCCCGAGTATTGCTCCA 

AAGATGATTGCTGTAATAAATCCAATAATGAATGAGGCCCCAGTAAACATCGCCGCTTTT 

AATCCAAGCGCTGATAGGTATGCAGACATAAATAAAAAGCTTAGAATTGAAGCGATAACT 

CCCCCTATAACTCCGGATATTGCTCCAACTAATCCACAGTTTTCATAATCGCAAATACCT 

CCAGCATTAACATAGAGATGAGCGGCAACAGCACCACCAATAACATAACACAAACAACAA 

ATAGCCCCCAATATACCGTTTATAATTCCTCCAATTACCGCTGGTTTTAACATTCTTTCC 

TGGTCAAAACTTACCATATATTTCACCGAGTTTATTTTTAATTATAATTACATTAAAATT 

TTTATTTTTTGATTTTATATATTTTTCTATTTTTATATTATTAACATTTACATCCATAAG 

CTTCATACAAAGTTCCATTAATCACTGTATAGATTGGAAATCCCTTAACTTCCCATCCGT 

CAAATGGACTAAATTTTGCCTTTGATTTAAACAGTTCAGCATTGATTTTTCCTTCTTTTT 

TTAAATCAATAATTGTTAGATTTGCTAAATTGCCTTCTTCAATTTTGTTGTTTATGTTAA 

ATATCTTAGCAGGATTTTTTGATAAAACCCTTATAGCATCAAACAAACTTATTAATCCTT 

TATTAACTAAATTTAAGGTTAAAGGAACTATCGTCTCAATTCCTGGAATCCCCGAAGGGC 

AGTTTTTGACATTTTTAAGTTTATCCTCTAATAAATGTGGGGCGTGGTCAGAGGCAATAA 

TATCAACATCTTTATTAACAATTCCTTTAATTAAAGCGATATTATCATCTTTTTCTCTTA 

ATGGAGGGTTAAACTTGCCAAAACCCTTTAACTCTTCAGCCATGTCTTTATTTAAATAAA 

TATGATGGGGAGTAACTTCAACAGTTATTTTTATATTTTTTAACTCTTGTCTTACTTTTT 

TTATTAAATATAGAGCTTCTTTAGTTGAAATATGGCAAAAATGGACATGTGGTTTTTTAT 

TACTCTGCCTATCAATAATCTTTAAGTTTTTTATAACTTCTTTAACTGCTTCAACTTCTG 

A'^TTTTCATCCCTAATTTTACAATGGTCTATCCAGCTGTTTAATTGATATTTCTTTAGAT 

TTTCATTTATTACATCTTTGTGTTCAGCATGGATGCAGAAAAGCTTATTTTGATTTAAAA 

TATCTTTTAATTTTGAATAATCCTCTATAAACAAATCTCCAACAGATTTAACCATAAATA

TCTTGTATGCTTTTGCATCTTCTACAGTTCCAAGGTAATTATTTTCAGTAACTCCAAAAT

TCAAAAACACATTTATCTTACTATCCTTTTTACAATCTTCAAGTTTTTTATA/iLAATAGTT 

CTTTTGTAGTTATTGGAGGTTTATTATTAGGCATGTCTATGGCAAAGCAAACTCCTCCAT 

TTATTCCAGCTAAGCTACCACTTAAAAAATCTTCCTTCTTTTCCTCTCCCCATCTAA7VAT 

GAACATGTGCATCAATAACTCCCGGAATAACTAAGGAGTTTTTTATATCTATTATTTCAT 

CATCTACTTTAATATCTTTGGCTATCTTTTTGATTCTACCATTTTCATCAATTAAAATAT 

CTCCTTCAATGATTTTGTTGTCTTTTATTATTCTACAGTTTTTTAATAGCATGGTATCAC 

ATCTTAATTTATTAATACAAAACTAATAAAAATAAAAAGTATTAAATAAAAATATTAACC 

ATCTTTAAGAGTTTAGAGGCTGGTAGTTATGCAATTGGGAAATGCAGAAGTATTTTATAT 

AGCTATGGGAATTTATCTATTTTTATTATTTGCTATTGCATTTATGACTTATAGATGGGT 

TAATAAAGAAGTAAAACCAGCTAAAACATAACTTCAAACTTTTTTTAATTAGCTTACCTC 

CTAAAATCCAAAATATAACTATGCTAATGGCAAATATAAAGCATAATCTAATAGTAATGG 

CTTTTTTATACACAAATAAGTTAAATGTGCTTAGTAATAGACCATAATAGCTATGTTCAA 

CTAACACATTAAATATTTTTCCAACTACATAGTTACGAAATGCTATTCTATCCTTTATTG 

TTATTTGCCTATGCATTTCAAAACCATAACCCAATATGAATTGAAAAGCAAACAACAAAC 

CTAATCCAACAATAACCATAAATATAATGCATGGAATAGTAGGATACTGCAGAGGCTGAA 

ACATTATACCAATAGCTATAAGCAAATAAGTTATTAAGTTATTAAAGTTAGCTTTATTAT 

TGATACTGCATTATCTATGTAATATATTATTTTATCGTTCATGAGCATCACAATTGGTTT 

TTTGACTAGAATT7VAATTTATAACAATCTATACCTCCCTGGTTTTGGTTCATCTATATCT 

CCATATTTAATTAATTTTTTAATAATATTTTCCAACTCATCCTCTTTAATACCTTTCTTT 

TTTGCTTCTTCAGCTATATCTTCATGTTCAACAAGTTCTGATTTTTCAGATAACTCCTTA 

ATTATCTCATAGACGGTTGTTAATTTGTCTCTCTCTTTCTTAGACACCCCTAAAATTTTA 

TCAACATCAAATATTCCAGTCTCTGGGTCATAGGCAATTTCTTTTAAGCATTCAGTTATT 

ATATTTATTGCCTCCTTTGCATCTTCCTCATCAACAACATCCTTTAACTTTGCCTTTGCA 

TGAGCTTCAGCAATCCTTATAGCAGCCTCTAACTGCCTTGCAGTTATCTGATGTTTTTTT 

CTCATCTCTACATAATAATTAACAAATAATTCCTTAGCCTTTTCACTAATTATCGGCTTT 

TTCTGTCTTGCGTAGTAGATATATTTTATTATAAATTCCTTGTCTATTTTAACTCCATCA 

ACCTCAAGGTAATCTAAACCCATCTCCCTGTTTATTTTCTCATCTAAATATGCTCTATGC 

AAATCTACAATGTATTCAGCGATATCTTTATCCTTATCCTTATCAG7WVCATCTCTAATT 

GGAAATATTAGGTCAAATCTACTCAATAATGGGGCTGGAATATTTATCTGCTCAGCTACA 

GAAACCTCTGGGTTGAATCTTCCCCATCTTGGATTGCAAGCGGCTA7VAATTGCACATTCA 

GCTGGAAGTTTTGCATTTATTCCTCCTTTACTAATATGGATTGTCTGACTCTCCATAGCC 

TCCAAAACATAGCTCTGCAGTTCTTTATTAACAGTTAGCTCATCTATACATGCAGTTCCT 

TTGTGGGCTTTAACTAACAAACCTGGCTTAATAACCCATGTATCTTCACCAATCTCTGTC 

TTCTCCCTAACAACAGCGGCAGTTAGCCCAACACCAGTGGCGGTAGTAACAGAACCGTAT 

AAATTTCCTGGGATTTCAGCAATCTTTCTTAGTATGACTGTTTTTCCAATTCCTGGGTCT 

GTGATTAATAATATATGAATATCAGCCCTCTTTCCAGGTTTTTTAACTCCCTTTATCTGT 

TGTAGTAAGACAGCCTTCTTTATTGCAGAATGCCCCTTAATCTCTGGAATTAATCTATCT 

GCAAGTATATTAACAACATCTTTTCTTTTAGCTATTTTTTTAATATTTTCAATATCTGAA 

TTTGTTAATTTAATTTTTACTTCCCCATCCAAAACCTCACAGTGTAGGGCTTTAACATGT 

ATGTCATAGATTGGTAGCTTTTTACTCTTCTTAACTTTTATTGGGATGCCAGTTATCTTC 

ACCCTTCCAGCATATATTCCAGGACTGTTTTCTAAGAACACAGTTATGTATTTTGGCGGC 

TCTTCAGGATTTTCCATTAAATCCAATGGCTGTTGAACTTTAATCTCTTGGAAGTCAGTA 

TATATTGATTTATGCTCAATTAGGTTTAACTCAGCTCCACATTCACAAACAGCTTTTTCA 

GAGTCAGTGTTTAAGATATCTATTTCTCTAACAACTTCTCTTCCACATTTTGGACATATA 

TAATAAGCTTTTTTAAGCATTGGTCTTATTTTTGATGCCATAACAATGATTCCTTCAAAT 

TCAACTAATTTTCCTAAAGTTTTGCTCCTAATATCCTCTATTGTGAAAATTTTCCCTTTT 

CTTGTAGTTTTAAAAATTTTTGGGAGATTTTTTACAGCAATTATTACGTTTGTTGGATAT 

TCATTTCTTAAGGTGTAATAAGCATCGTTGTAGCACTCTTTTATAAAATCAATCCCTTTT 

TGTGGATTATTTATTAAAAATTCTACAAATTCCATTAATCCGTAATTGTAGAGTTGATTT 

AAATCAACTACAACTCTTTCATTGTCTAAGATAATATCTTCCTGATGAATATTTCTTAAA 

TAGGCAGTTAAATAATCCCTAACTTCCTCTAAAATTAAGTCTTCATCTCTCAATTCCATA 

TCTACATCCCCATACGAAGAAATCA7VATTTTAGAGAAATTTTAAAACGAAAAATAGTTGA 

AATTTTGCTTTTAATCTTAATTATATTATTTAGTAGTTGTTCTATTTATCTATTCTGTCA 

TTTATTATAAACTATTTATATAATTAACAACCTTTAAAACTCCCATGGCTATTCCTTCAA 

CCTCAATTCCACCCCTTCCTTTAGCTCCATCTCCAACTAAATAAAATCTATCATTAACAA 

TATTGTCAATATCAGTTCCATTAGATGCATGATTTACAGGCCAATCATCCCTATATGACT 

GAATATGCAATATTTTGTAGTCTTTTCCTTTAAAGAGGTTTTCAATATCCTCTAATCCCA 

AATCAATTTCCTTTTTTACATTGTTGGTTAGCTGTGTTTGATGAGTCATAACTAAATGCC 

ATCCTTCAGGAGCTAAGGATTTATCTACATTAGTTACTTGGTTTAAGCCGTTTATCCTCT 

CACATTCTGGGGTAAAGAGAACACCACCATGTTTTATAATTCCTTCTTTTGTGGCTATGC 

TTATCTTTATTCCTTTAGATGGCTTTGGCTTTGATTTCAAAAATTTTATATTGCATATTT 

TCTGGGTTTCAATTGGAGAGATGTTGCTTATAACAACATCGAATTCATAGTCATCAATAT 

AAGCTTTTTCATCAATCTCAATCCTTTTAACTTCATATTCCTTAATAATTTTTCCATTGT 
TCTTTTTAATAATCCTCGAAAGTTCATCAGTAACTGCCTTACATCCACCTATTGGTATTC

CAGGTCCTCCAAATTTGTGGTAGTTTTTAGCTATCTCTATAATTTCACTCATAGGTGTTT 

CATAAGCTGTTAAACTCAAAGCCCATCCAGTAAATGCATTTCCAACCTTTAAAGCTAAAT 

CAATCTCTTCTAAAAACTCTCCAAATGAGATATTTTTATCAACTTTTCCCAACTTTAATT 

TTGTAGCTAATTTAAATGCTTTTGCCTTTTCTTTAAAACCTAAGAGTGAAAACAGCTCTT 

TATATAAATACTCCTTCCCATTAATTAAAAATGTTCCATCTGGTTTTGAGTTTATTATTT 

TTACATTAGCTCCAGCCTTTCTTAAAGCTTGGGCTAAATAGCCATCATTTCCGTGTGGTA 

TCATGTGTAAAGCTCCTGTTGTTAGTTGAAAGCCCTCATACTTCAAGTTTGTAAATCTCC 

CTCCTAAGAATGGAAGTTTTTCAAATACAACAACTTCATGATTCTTAGATAACAATGCTC 

CAGCTAATAATCCACCTAATCCGGCTCCAACAATACCAATTCTCATAATATCTCCCTTAT 

TTGTTTATAATTTCCCAGTTTTTAAATATTTTGATATCTTTTGAGCAATTATTTCCTGAT 

TATGCCATGTTCCGTGAAATATACCCTTAGGAATCTCATACCTCATAGCTCTTATTATCT 

TTTTATCAACTTTCCCAGGATTTGCCTTAGCAAAGGGTGTTTGTGGTAAAGGCATAAAAG 

TATGAGCATGTATTTTAGCACCCATTTTTATTAAATCCTTCATAACCTTTATTGTCTTTT 

CTACATCTTCCTCAGTTTCTCCAGGCAAACCAAAAATAAAATCTACATCTACTCCAAGTC 

CAGCTTTTCTCGCTACTCTTACAGCGTTATAGACATCTTCAACCGTATGTCCCCTATGGC 

ATAGTTCTAATACTTTTTCACTACCAGATTGAGCACCAATAACTAAATTCTTATTATCAG 

CATATCTTAAAATTAAATCTACCGTCTCMTATTCACATGCTCTGGTCTAACTTCAGAGG 

GAAATGTTCCAAAAAATATCCTTCCATTATTACCTAAAATTTCTCTAATACTTTCTAATA 

GTTTTTCAATTTTATCAATATTTAATGTTTTTCCGTCTTTAGAACCATAGCCAAAGGCAT 

TTGGAGTTATAAACCTTATATCTTTCAAATTCCTTTCAGCCATTATTTCAACATATTTAT 

ATATATTTTCAACATCCCTATGCCTTATCTTTTTTCCAAAGATTCTTGGTGTTTGACAGA 

AATAGCATTTGTAAGGACAACCTCTCGTTATCTCTATATGTCCAAATTTATTATGCTTTA 

CAGGAAATGGTGGATACTTATTTAAATCAACAGGTTTTCTTCTTCCAGTGTAAATAAATT 

CATTATCATTTAAATAGGCAATACCTTTAACTTTTTTATAATCCTCATCTTCATTAACCG 

CCTTTATAAATTCTGGAAACGTCTCTTCTCCCTCTCCAATGCAAACAACATCAAATCCCA 

ATTTTAACGTTCCTTTTGGGTCACCTGTTGGATGAGGTCCTCCAGCTAAATAAATAATTT 

TATTCCTATAACTTTGATATTTAGCTTTTAATTCATTAATTAATTCATAAGTTTTCCAGA 

GTTCAGTTGTAAAGAAAGATATGGCAATAACAACCTTGTCATATTTTTTTAAAACTTCCT 

TTAAATTAAAAATATCTTTTTTATTGGCAAAATATATTGGGAGGTTATCAAAATATTCAT 

CAATCTCTAAAGCTCCAATCAATGCATTGAAACTGTTTTTATGTAGTTTTGTATAATAAA 

CTACCAAAGCGGTGTTTTCTTCCATATTGCTCCCTAAACAATATTTATCTCAAATGAGAT 

AATTAACAAAAAACTATATTAATGATTCCTTTAAAAGCTAAAGTATAGAATAAAATTTTA 

ATGCTAAAAATTTTTTGGTGAAATTTATGGCAATTGGGACACCTCTTTTGAAAGGAAGTA 

TAAAATTTTTGTTGTTAGGAAGTGGAGAGTTAGGGAAAGAAGTTGTTATTGAAGCTCAGA 

GATTGGGAATTGAGTGTATAGCTGTTGATAGGTATCAAAACGCCCCAGCTATGCAGGTTG 

CTCACAAGAGCTATGTTATTGATATGAAAGATTACGATGCATTGATGGCAATTATTGAGA 

GGGAAGAGCCAGATTATATTGTTCCTGAAATTGAAGCAATAAATACAGATGCATTAATAG 

ATGCTGAAAAAATGGGTTATACTGTTATTCCTACAGCTGAAGCTACAAAGATAACTATGA 

ATAGGGAGTTAATAAGAAGATTGGCAGCTGAAAAATTAGGATTAAAAACTGCTAAGTATG 

AATTTGCAGATTCTTTAGAAGAGTTGAGAGATGCCGTAGAAAAACTTGGCTTGCCTTGTG 

TAGTTAAGCCAATTATGTCTTCATCTGGAAAGGGGCAGAGTGTAGTTAGAAGTGAAGAGG 

ATATAGAGAAAGCTTGGAAGATAGCTAAAGAAGGAGCAAGAGGAATAGGAAATAGGGTTA 

TTGTTGAAGAATTTATAAACTTTGATTATGAGATAACCTTATTAACCGCAAGAACTGCTG 

AAGGAACTAAGTTTTGTGAGCCAATAGGTCATGTCCAAATAGATGGAGATTATCATGAAA 

GCTGGCAACCTCATAATATGTCTGCTGAATTAAAAGAACAAGCTCAAGATATAGCTAAGA 

AGGTTACCGATGCTTTAGGTGGTTATGGAATCTTTGGTGTTGAGTTGTTTGTTAAAGGGG 

ATGAGGTTATATTTAGTGAAGTTTCACCAAGACCTCATGATACAGGAATGGTTACAATGA 

TAACTCAAGAAATGAGTGAGTTTGAAATTCATGTTAGGGCTATTTTAGGTTTGCCAGTAT 

CAACAAAACTTATTCACCCAGGGGCAAGCCATGTAATAAAGGCAGAGATAAATAAATATG 

CTCCAAAGTATCATATAGAGGATGCTTTAAAAGTTCCAAATACTAAGTTGAGATTGTTTG 

GAAAGCCAAATGCAAAGGTTGGTAGAAGAATGGGAGTTGCTTTAGCTTATGCCGATTCTG 

TAGAGAAGGCAAGGGAATTGGCTGAAAAATGTGCTCATGCAGTTAGAATTGAATGATTGG 

ATATTTAGATAATATTTGTCTTGTTGAAAAAATTTAAATCTATGTTTAATTAGCTTATAA 

AATCTATTTCTTCATTTGAGAATTTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTA 

TTTAGAATATTTGAGTTTATTAAATTATTTAGATTTTTAAAAATTGAGATTAATTAGGTA 

AGTAAATAAGATTTCTCTAACTAATAAGTTAAATTTTTGAATTTAAGGAGATATWUVTGC 

TTAGTTTTAGTAAAGAGATAAAATTTTAAATACTAAAAGGTTTATATTGTAAGATGGTTA 

TTTATCCTTAGAAAMTATGGTATAGAAAAGCTTAAATATTAAGAGTGATGAAATATATT 

ATGTTGTGAATGATTGCCCTGTTAAAATCAGACCTCTTGGAGGATGGAAATTTAAATGCT 

TTTACTAAATATTTTGTTAAATAATTCGTGTTAAAATCAGACCTCTTGGAGGATGGAAAT 

ACAAGTATATTATAAGTGTATTGGTAGTATATAAATTTTTGTTAAAATCAGACCTCTTGG 

AGGATGGAAATCTGTTTTATCAATTTTTCAGCTTCATCTGGTGTTATTATGAATAATGTT 

AAAATCAGACCTCTTGGAGGATGGAAATCTGCCCGCTCTTACCTTTCACGGCAATATAAG 
CATTAAACGGTTAAAATCAGACCTCTTGGAGGATGGAAACGTTAAACAATCTGCTATGAT

AATCATAACTAAATTCATTTGTTAAAATCAGACCTCTTGGAGGATGGAAACGAAGTATCT 

TCATTTACTATTACTAATTGATAACCTTGTGCATCTTTAGTTAAAATCAGACCTCTTGGA 

GGATGGAAACTTATCTCCTCCATTTTTATCTGTAAAAATTTTATTAAAATTAAAATAATT 

AAAATAAGACCGTTTCGGAATGGTW^TATAATTTAACTAAAAACTTGTATGCAACTGCAA 

CGTCATTTATTATTAAAATAAGACCGTTTCGGAATGGAGATTAGCAGTTTTGTCAGCTAT 

TCATATATAAAATAAAAATCTTTTGAAGATTTAGACTTAAACATTTAGTTTATTTTTTTA 

AAAGTCTCAGAGTTTTAAAATACAAAGTAGCAAATAAAACAAGCACTGGGATAATTTCCA 

ATCTACCAATCCACATTGCTATAATTCCAGCTATTTTTCCAATTACTGGAGTTTTTAAAG 

TAACTACCCCTAAAGATATGCCTATATTTGAGGTAATU^GAAACAGCATCAAATATTGAAT 

CGTAAGGGTTATAACCTAAAGCTATAAATATTAAAGCTGTTAAGAACGAAGATAAACAGT 

ATAAAAAGAATACAAC7UVATGCTTCCCTAATTATTCTATAATTTAAGTCCATATCATCAA 

GATGTTCATGAATCACTGCTGATTTTGGATAAATAATTTCTTTTATTTCATAT7VAAAGTG 

CCTTCAGTATAACTAAAAATCTAATTATCTTAACCCCTCCAGTTGTTGTCCCTGCCCCTC 

CACCAATTAGCATTAAAAAAATTATCAAAAATAGGGATAAGGATGAGAGATTACCTACAT 

TTATAGTTGTGAATCCAGTTGATGTCATTGCTGAAACTACTGTAAAGAGAGAATCTATTA 

TTGGAACTTTATCCTTTATTGAGATGATAATTGAAATAAAGGCAGTAACAATTAATGCAT 

ACTTTGTTTGAATGTCATTAAAATACTTGCCCGTTAGTAATTTGTGATGTATTGAAAATG 

ACATAACTCCTCCAACCATCATTATGCCAATCATAACAATTTTTGCAAAATCGTTGTATG 

GAAAGCTATAATTGCTTATACTCATTCCTCCAGTAGATATTCCAGTCATGGTTAAATTTA 

AAGCATCCCAAAAACTTAATCCAGATAAATATAAC/UWVGAACCCCTAAAATAGTGTATA 

AAATATAAATCCAGATAATAGTTTTTATTGTTCCTATAGCACTTGGCATTATCCTCTCTT 

GTCTCGCCTCAGATGTATATAAAAGATAAGCAACAGTTCCAGACCTTGCTAAGACAAGAG 

CTGATAAAACCAATATTCCAACTCCACCAATCCACTGCTGAAAACTCCTCCAAAATAAAA 

TAGATTTTGGTAAAACCTCAACATTAGGAATAAGAGTCATTCCAGTTGTTGTCCAGGCAG 

ACATGCTTTCATAAACTGCATCAACATAAGAAAAATAATCTATAGATAAATATAAAGGAA 

TGGCCCCTATAAATGAAGCTATAAGCCAAGCCAATGCAGAGGCAACCATGGTATGATGTA 

GTTTTAAATTTTTTGGTTTAGTAGCTCTCTTTAAAACAAATCCAAAAATAGAAAAAAATA 

AACCTGGAATTAAAAAATTTAAAAAGGTGTTTTCATTGTAATAAACTGACACTATACATG 

GAACTAATGTAAATATTCCAATAATTTGTATAATCCCCCCTAAAATATGTAT^TTCCTT 

CAATGTCTTTTTTTGTTAATCTACAGATTCCCATAATTCTCTAACCCATAGAAACATTTA 

CCTCGAAAAGTTTAATATCTTTTGTATGCTCCTTAATTAAAACAACATCCTTATCATACA 

TTTCTACCGAATAGTTTATTATCTCAGACATATTATAATATTTAAGCTTTGGATTTTTAT 

CAATGTAAGATTTGTTTCCATATTCTTTGAGCATTTTGTTATAAAACTTATCTGCTAAAC 

TTTCATTTTTAAATATCCAAAAATTGGTTATAATTACTTCATTTTTACCAAAATATTTTT 

TGCACTCTCCTTTAAAATATATGCTGTTGTTATTAATATAATCAACACGTGAAATTCCCC 

CTATTCTAAAAAATCCGTTGCTATTGGACTCATAATAACTCTTTGTATTCTCCTTAAGTA 

TTTTATAAACCTTATCTCTCAAGGTTTTATTGGAAAATAAGGATGCATTATATTTTATTG 

TAAATATCTCCATCTAATCCCAGAAATATCATTCGCTTTAATTATTAAGTTATTTCCAAC 

AATAATATATCTGTCAGAGAGTGCTTTTTCATTTTTGTAATAATAAACGCCCCTACAACT 

TTCAACAGAGTAGCCATTTTCTATAAGTTCTTTTAAGATTTCTGTTTCATTGAAATCTCC 

TTTTTCTATTACTACTGCATCACTACCTATTATAACAGTTTCTATATTGTTGATGTTCCG 

TTCCCAATCAAATCCGTCTTTTTCATAATAGAATCTATCACAAACTTCTGGAACAAAAGT 

TAATATATTTTCATCTCCAAAGTCATTTATTGTGCTTTTGTTTATCTCTACCTTTTTAAA 

ACTAAAATTTGATGAGCTAAAGTAGATTCCAAAAATTATAATTAAAGAAATCAAAAAAAC 

AATAACCGAAAGAGTTTTGTCCATATTAATCAACTCCAGCCTATAATCCCTCTTCTAACA 

TATCTTCCTCACTCAGCCCTTCTAACTCAATCATCGTTTTCAATCTTTCGGTAATTATAT 

TCAAACATTTATCAACAGTCTCACTAATCTTTATATTCTCAACCACTGGAATTCCCTTCT 

TTTTTGCAGTTTCAACCATGTAATCGTTTATCATTCTAATGATTTTAAAGTATTTTAAAT 

ACCTCTCAGTAGGTCTGCTTGAAACTCTTCCCCTTGCGTAGAATCTCATTTTATGCAACT 

CTTCATTGTAGATTGTTAGCATAATAAAAACTACATGGGAATTTTCTAAATATTTATCTT 

TT7UVAAGTGTTGGGACTAAGTGAGTTCCTTCGATAATTACACTCTGCCCCTCAACTAAGC 

ATCTATCTATAACTCCTTCCACTCCAGTTAATACTGCCTCAGAATGCCTCTCAAACCCTT 

TAATGTATTTATTGCCCTCATCATCTCTCAAAACCTTCCAAGCTGTATAACTTGATTCGT 

AAAGTGTAGGGATTAAATCTCTTGATATAACCTTTCTCATAACTTCCCTTATAGAATCAG 

TTCCAATAACGCTTGGAATACCCAATCTTGAAGCTATCTCAAAGGCAATAGTTGAAGTTC 

CAACACCACTCGCTCCACCAATTAAGATAACTATCGGTCTTCTTCCTAAAACCATTCTCC 

ATAGTAGATATTTTTTAGCAACTTCATCGTAATTTTTTGAAATTAAGTAATAATAAACTC 

TCCTCCTCAAATCAGCCTTATCTATAACTCTGATATTTTCCTTTTTTAACATCTCGTATA 

TATCCCAGGCTATTCTATAGGCAATACTTGGTTTTAATCCAGCGGCTGTTAAAGACCTTG 

CCAAAATACCCTTTGAAAATGGCATCTCATAGGATTTTCCCCTCACAATAATATCATTCT 

GCAAATCCATTATTCCACCGAAATTTAATCTAAAATTTCATCAGCATCCAATTTTTCAGC 

ATTATAAAGATTTTTAGCCCTCAAATAGAGAAGTTCATCTTTATCATCCTCAACCACAAC 

^^^^^^TTTTTTCTGGTTTGTATTTTTCTATAAATCTCTCAACATTCTTTCGTGCTATCTC 
AACCTTTTTCTTTAATCCTTCTAATGCTTTTTCTGGAACGCCTATAGCTCTCATATCTTC

AATATCTAACATTCCCTCAGTGCATACAACTTTTAAATTAGGATTTTTGTTTTTAAGCTT 

TTTAAATAACTTTTTATTTGAGACAATGTATAATGCACCTTCATCTATTTCTTTTTCTTC 

AAATTCAAACCCAAAGTTAGCTAAGATTTTATTTAAATGTTCTTGGCATTTTATAATCAA 

TTTACAGAACTCTTTAGCTTCCTCTTCATTCAACTCATGCTTTGGGGCTTTTTTATATAA 

GAAGTCATCTGCCTCAATCAACAAATAAATAGCTTTTTTAAATTCATTGACATCTATCTT 

CCCTGGCTTTGCATCTTTGTAGGATATTTTTTTATCATCCTTTTCTACTCTTATTTTAGC 

TTTTTTTAGTTGGGAAATAGTTGAAATTCCCTTTCTTATCAAATCCTTTGAATATTCAAC 

TCTCATCTTCTCCTCCTCTATTCTTCTCTCTTTTCAGATTTTTCCTTTTCTTTCTCTAAT 

TTTTCCAACTCCTCAACAAGCATTGGTAAAAAGACACCAACATCAGTAACTATCCCCAAA 

GCTTGTGATGTCCCTCTATCCATTAACTTTGTTACAACCGCTGGATTTATATCAACGCAG 

ATGGTTTTAACCCATGAAGGTAATAAATTACCTGTAGCTATTGAGTGTAGCATAGTAGAA 

AGCATTAGAACCATATCCTTTCCTTTTAAAAGCTCTCTCATTTTTTCCTGAGCTTTAACA 

ACATCTGTAATAACATCTGGTAATGGGCCATCATCCCTGATACTTCCAGCTAAAACATAA 

GGAATGTTGTTTTTTATACACTCATACATAACTCCTTCCTTTAAAATTCCCTGCTCTACA 

GCATCTTTTATGCTTCCAGCCCTCATTATTGTATTTATAGCCCTTAAATGATGACTATGC 

CCTCCTGGAACGCTCTTTCCAGTCTTTAAATCAACTCCTAAAGATGTCCCATATAAAACG 

CTCTCTATGTCATGAGTAGCTAAGGCATTTCCAGCAAATAGTGCTTGAACATACCCCATC 

CTAATAAGCTTAGCTAAAGCCCATCCAGCTCCAGTGTGAATTATAGCCGGACCTCCAACA 

ACTACAATTCCTCCTTTACCTGTCTTTCTATATTTTTCTCTAATCTCATACATCTCCTTA 

GCTATTCTTCTAATAATTGTTTCTTTAGGCTTTTCTGAGGAGGCATCTGATTTCATAAAC 

TCAAATAACCCCCCTCCTTCTCTTGGTTTTTCTGGAGGGATGACTCTAACCCCTTTATGC 

CCAACAACAACTAAATCTCCTTTTTTGATATTTCTTATTGTCTTTACTTCAGCCCTCATT 

TCATCTGGATA7VACAACGATAGCTCCGTCCATTTTTTGGTTTTCAACCTCTATCCATTTG 

CCTTTGAACCTAATAAATGTTTTATGATTGGTTGTTGAATAAAAGCCCTCTGGTAAGACC 

ATATCCTTCTCAGCTGGCTGTAACTCAACCTCTTCAATCTCTGGAATCTCAGCTCCTAAA 

TCCCTCAACTCATTCAATATTTCATCTACATGCCTTTCATCTCTACCAATAACCAATATC 

TTTGCATAACTTGGGTCTGTTTTTCTCTTCCCAATCTCAAACTCTAAAACTTTATAATCT 

CCGCCCATATCTAAGATTTTATCAAAAACCTTAGGCAGGATTAAGCTGTCAATAATATGC 

CCTCTCAATTCAATTTCTCTCATGAACATAAAAATCCCCCAATAAATGTTATCTTAGGAT 

TAATTAACGATGATGAAGTATTTAACAATTGTCATCAAAACCTTTATATACTATTTTGAC 

AGTTTTTAATCCAATTTTTATCTACTTTACAAAGAGGGATAATTTGCATACATTAAGATT 

TAAAAAAGATAGAGCGATAAAAATAAGTGAAGAGCTATTTCCTGATGAGTTATGTGAGAG 

ATGTGGAAGATGTTGCATTTTACACGCTTACAAAACTGAAGATGGAATTAAAACAATATA 

TTGTGAGCATTTAGACCCAGAAACAAAATTATGTAAAGTTTATAAAGATAGGTTTAAACA 

TAGATGCTTAACTGTAATGGAAGGAATCTTAGCTGGTGTTTTTCCAAAAGACTGCCCCTA 

TGTTAAAAATTTAAAAAATTATGAAGAGCCATGGTTTTATAGGCATTTGAGAGATTAGGT 

CTTTAAAAATTCATCTATTTTTTCAGCTAATGTGTCAAATATCCATTCAAACTTTTCGTC 

ATCTCTCTCTAACAATGTAACTCTAAATCCGTTAAGTTGAGAGCAGAATGAGGTTAGAGG 

AACTACACAGATTCCAGTAGATGCTAAGAGATAATAAACAAATTTCTTATCTATAGATGr 

ATCTTTTATTTGGTGTTCTATAAATTCCTTCAATTTCTCATTCTCTATTTTTATTGAATT 

^3;i^^^^^^'^^^^°TT^T^TTCAAATACAACAGACATATAGAAAGCTCCATTGGCTTT 

ATTTGCTATAACACCATCTAAATCTTTTAGTTTTTTGTAGGCTGTGTTTGACCTTTTTTC 

AAAGAACCTATTCCTCTCCTCTAAGTATTTTTTGTAATTTCTATGCCCCATAATTCTTGG 

AATAGCCATTTGTGGCAATGTAGTGGAGCAAACCTCTATCAATTTGGCTTTATAAATACT 

$^$^^^™TTTTTTAAATTCTTCATCCTTATCGGCATTGTAAATTTCAATCcJ??ScI 

TCTTGCCCCTGGCCATGGAAGTTCTTTTGATATACCCTTTAAAGATAAACCGCAGAcJJ? 

TTCATCACAAATAATAAATAAATCATATTCATTGGCTAAATCAACAATCTCATTTAAGAT 
TTTTTTTGGATATACTGCTCCAGTTGGGTTGTCAGGATTTATAACCAAAATTCCAC^AAr 

^^Sr^^^'^'^^^'^'^GTTACTGGAGGAGAGCCAGCATGGGATGCCTCTGCAGAAGAATG 
^S^^^^^'^^T'^ATGGGGATGGGTTTATAACTCTAACCTGCCTCTTCAATAAACcJ?^? 
CTTTGCAATGGCATCTCCTAAGCCGTTAAAGAATATGATGTCTTCAGCAGTTATcJ^JJ? 
TCCTCCTCTTTTATTTACTTGTTCGGCTAAAAATTCTCGTGTTTCTAATAAACC???^? 

aggacagtaggcataagaacagtcgtttttaacaatctctgctatIJ?™t?IIScI 
tttagcccctacatctattataggattcctcatgttttcatctcaaaatggaactctatt 

IISI^'"^'''''''''^''^^=^=^^^^"r^CCATTATCCCAGTAGTATATAAACT?TAC^ 
ACAAAAAACTTACTATAGAAAAGGCACTTATAAAACCAAAGACTTTTATATTCTTACCTT 

aaaaattgcagttaattttgaaaagcacgataaacgataattccctaaatatatggtgaI 

AACAATGAAATGCAAATTTTGTGATAAAAAGAGTTATATAllGSciJ???S?SG^ 
GTATCTATGCAAAGAGCATTTTGTTGAATATTTTGAAAATAAGGTTAAAAAATCAATAGA

TAAGTATAAAATGCTAAGTAAAGATGAAAAAATCTTAGTTGCTGTTTCTGGAGGTAAGGA 

TGGGCATGCAGCTGCATGGGTTTTGAAAAAACTCGGCTATAATATTGAGTTATTCCACAT 

AAATTTAGGGATTGAGGGATTTTCTGAAGAATCTTTAAAGGCTGTAAAGGAGTTGGCTGA 

AAAATTGGAAGTTCCTTTGCATGTTGTTAATTTAAAAGACATTACTGGAAAGACAATGGA 

GGATATTAGAGGTAAGAAATGCTCTATATGTGGAACAACTAAAAGATATTTAATGAACAA 

GTTTGGTTATGAAAATGGATTTGATGTCATCGTTACTGGGCATAATTTGGATGATGAAGT 

TTCCTTTATTTTAAACAACTTATTCAATTGGAATATTAGATATTTAGCTAAGCATGAGCC 

AGTTCTTCCAGCTCATGATAAATTTTTAAAGAAGGTTAAGATATTCTTTGAAATTGAGGA 

AGAGTTAATTTTAAAGTATGCTGAAGCTGAAGAAATCCCATATACAACCGTTGAATGCAA 

ATATGCTGAGAGAGCTATAACCTTAAAGCATAGAGCTTATTTAAATGAGTTAGAAAAGGA 

AAGGCCAGGTATAAAGTATCAATTCCTATCTGGCTATATGAAAAATAGGCATCTGTTTAA 

AGTTGAGGAAGAGGATTTCCAATTTAGAGAGTGTGAGGTTTGTGGAATGACATCTGCTGG 

AAAAATCTGCTCATTCTGTAGAGTTTGGAAGCTCTATAAGAAAAAGAAAGAAAATAGAAA 

TTAATTTATCAATTTTAGCCACCATGTATTTAGTTCGTCTAATTCTCTATCGGATTTAGC 

AACATAGGGGACATACCCTTTATCTAAGCTTTTATTAATAAAATCTTCAACTCTCTTCAA 

ATTTTCATTTGTTTTTGTTCCATCATCAACATAATCAACCACTAAAACAAACTTTCCAGA 

ATCTTTTACTTTATCCAACAATTTTATTCTTTCATTTATCTCTTCCTCTGTCTTTTGCTC 

TACACCATCATAAAACAAATCTTCAACAGCCCATCCAGAAACTGTATTTAATAACTTTCC 

ATGTTTATCGTACTCCAATAATCTTTCACCATTTTGTGGAATTATTATAAAGCTGTTGTT 

AGTTTTGTTTCTGCAGTAGTTTGATATCTCAACAATAAATTTAATCATCTCCTTTGCTGT 

AAAATCTTCATCATAGCCATTTTCTGCCCAGTATTCGAACTCATCAACCTTATCTAAATA 

AACTCCACAGAATCCTTGCTGAATAATTTTATCTAAATAGCTAAAAATTATTTTCTTCCA 

TTCTGGATGCCAATATTTCACAGCATAACAGCCCTCCCATTCTGGGTTTTCATCTCCTAA 

CCACTTTGGAGGATTTTTTAGCCATTCATTGTCCCAATAGAACCTATAATCTTCAGCCTC 

TCCAATGCTGATATAGGCAATAGGTATTTTTCCAGCTTTTTTAAGCTTTTCTATCTCTTC 

TTCACTATATTTTCCATTTTCAGTCCCATCTTTTGAATAATCTATAACAATTAAAGTAAA 

GTTTGAGTTTGCTATTTCATCAATATCTGCATTTTGAAGTTGATATGCCCATAAAAATTT 

TAAATTGTTAGAATTTTTGCTGATATTTGTAAGGTTTTCCGCATTTCTAATATTATTTTT 

AGATTTAGACATCATTTTAGGGTTATCTAAAAAAGTACTATCAAATGAAATAAAAAATCC 

TACAATTAAAATAATGCAAATTATTATTCCTAAAATATGGCTTTTCTTCATGTTCTTTCC 

CCTAATTTTATTTAAATGCACTCATTAACGTCCATGCCTCCTTTCCACTTATAAAAGCCC 

TATTAACCAATCTCTTAAAGATTATTTTGCAGAGTTCTTTTTTATGCTCTGGGATTTTTT 

CATTCTTGTCAATAAACTCATTAAATTTCCTTATCAATAGCTCTTTGTCCTCTTTTGATG 

CCTCTCTCATATTTATATCTAAAAATTTATTTCTAACTTTTTTTGTATAAATCTCATATA 

AAATAACTGCAACAGCATGAGATAGGTTCATAATTGGATACTTTTCAGATGTTGGTATTG 

AAACTAACAAATCACATTTATCTATCTCTTCATTTCTCAATCCATCATCTTCCCTACCAA 

AGACAATCCCAATGTTTCCCTTAACCTCTAAGATTTTATCTGCCAACTCTTTTGGTGTTA 

TTGGAACTCTCTTTAAATTTCTATCTCCTCCTCTTGCTCCTGAAGTGGCAATAACAAAAT 

CTAAATCCCCTATAGCTTCATCAAAGGTGTTGTAGAATTTGGCATTGTCTAAAATCTCTC 

TTGCATGGACTGCCATCATATAGGCTTCATTATTTATTATGCTTTTATCTCCAACTATTC 

TAAGCTCTTCAAATCCAAAATTCATCATAACCCTTGCTATACTACCAACATTTCCACTGT 

ATTTTGGATTAACTAAGATGACAGAAATCATTATTATCACTGTTTTTCCTTCTTTTTCAG 

CTTATAATAATGGTACTCAACGGTTTTTATATTAATCCCAGTAATTTCAGCTATTTCTTT 

AGGTTTTTTATCTAAATACTTTTTAATAATCTTATCTACATTCGTAGGTCTTCCAGTCTT 

TGCCTTTATTGGAATAACTTCGACATCAACTCCTTCCAAAGCCTTTATAATCTTTTTTGA 

GCTTCTTTTATATTTTGATTTTGGTAAATATATCTTCTTTGGCTCACAACTTTCCAATAA 

AGCAATAGCTACATCCCTATCTAACTCTAAGTTTATATAAATCTCTTCTTCATTTTCACA 

TTCTTTAATTTTTTCAATCAATTCTTCCTTTGTTTTTGCTATTAATTTTTTCATAATACA 

ATCACTTACTTATTTTTTCTTCCTTTTCTTCTTTAACCTCGATTTAACCATCTTTCTCTC 

CATAGGGCTTCGCCCTATTGGTATACCCGGGATGCACTGCCTCGTTTCACTCGGCAGTGC 

CTCTTATAAGTTATTTTTTCTTCCTTTTCTTCTTTAATCTCGATTTAACCATCTTCCTCT 

TTAAAGGGCTACCACAAATCTCACATATATCTTCTTCATAATCTACTGGATAAAGTTTTT 

TACAACCTTCACAAATCTTTCTCCAAATAAAATCTTTATTTGTTGGTTCAAAAGCTATTC 

CCCTAACTTCAATATTTAATTTTTTAGCTACATTTTGAATGCCATAATCGTCAGTATATA 

ATATGGCGTTTAAATTTAGAGCTAAAGCTAAGACACCAATATCTTGTTGAGACAAATTAT 

CTCCAGTTTTTTTAACAACTTCTTCAACCTTTTTTATATACTCCCTATTAGGACTCATTA 

TTTTTAATTTTCCAAAATCTAATGCTTGTTCAACAATAATTTTTTTTGATTCTATCTCTT 

CCAAAACTTCTGGGGTTGTGTAATGTTCCCCCTCCTCTATAATTGGGTTGTATCCATGAA 

TAATAGCTGAAGCATCCAACACCTTAACCTTCATGATTCCACTCCTATAAATGTTAAATA 

ACTGATAAGGAGATTTATTAATAATCCATAATTTATAAAATTCTGGTGGTGGCAATGATA 

ACAACTGTAGTTGGTAGTTATCCAGTAGTTAAAAAGGAAGAAACATTCTTAGATAAGGTA 

AAAAAGGTATTTGGCTTGTATGATGAATATAAATATGCCATAGAGAGGGCTGTTAAAGAC 

CAGGTTAAAGCTGGAGTTAATATTATAAGTGATGGACAGGTTAGAGGAGATATGGTTGAG 
ATTTTCACAAACAACATGTATGGCTTTGATGGGAAGAGAGTTGtTGGTAGAGTGGAGTTT

ATAAAACCAATAACACTAAAAGATATTTTATACGCTAAAAGTATAGCCAAAAAACTCAAT 

CCTAATGTTGAAATTAAAGGAATTATTACAGGGCCTTGCACTATAGCTTCATCTGTTAGA 

GTTGAGAGTTGTTATTCAGACAATAGAGATGAGAATCTAATTTATGATATTGCTAAAGCC 

CTTAGAAAGGAAGTTGAAGCATTAAAAAAGCATGTCCCAATAATACAGATTGATGAGCCG 

ATACTATCAACTGGTATGTATGATTTTGATGTTGCAAGGAAGGCTATTGATATAATAGTT 

GATGGATTAAATATTAAATTTGCCATGCATGTTTGTGGGAATGTTTATAATATTATTGAT 

GAGTTAAATAAGTTTAATGTGGATATTTTAGACCATGAATTTGCTTCAAATAAAAAAAAT 

TTGGTGATTTTAGAAAGTATGGAAAAGAAAGTTGGCTTTGGTTGTGTAAATACAAAAGTT 

AAGAAAGTTGAAAGTGTTGAAGAGATAAAAAGCTTGATAGAAGAGGGAATTGAAATATTA 

AAAAACAATGAAAAATTGAATAAAAATTTGTCTGATAATATTTTAATAGACCCCGATTGT 

GGAATGAGGTTATTGCCAATAGACGTCGCTTTTAATAAGTTAAAGAATATGGTTGAAGCA 

ACTAAATTAATAAAAATATAATTAATTTTCCTCTATAAGTGGTTTATATCCTGGCATATT 

TGGATAAAGCCAGTAGTCAGTTTTGTTAGTATATAATCCAATGATAGAATACGTACTATT 

ATAGACCAATATATATTTCCCTGGTTCTCCATAGTATATAGTCCCAGTAAATGTTTTTGG 

TTTTTCTTCTTTATAGTCGAGATTTGGATATTCATCAACACCTATTGGAGGGACGGTAGT 

TGTAATATCATTTATATCTTTTAAAAATATCCAATCCACTGTAACATTAGTATTTAAATT 

TATAAGAGCAGTTATTGAAATTGGATAGTTATTTCCCTTATTTCCATTAGTAAATGAGTT 

GCTATATTCAATATCCAAAGTGTCATTAAATATTGTAAAGTTTAAGTCAGTGGTTGAAAT 

TCTTTTTAAATCATAAATATAAAATTTATTTAAGTATTTATTTGAATTAGGAACATAATC 

CCCAATATCACTTCCACTGTATCCAACTCTCATATAAAGTTCTGGATTATTTCCAGTCCA 

ATCATACATATCCCAACCCACTCCATCGTTATCGCTTAATTGTGTAAAGAATCCTATAGT 

TTGGGCATGGGATGGAGTAAAGTTTGCtCTAAATATTAATTCATATCTAGTTCCATAAGT 

TTGTTTTGTATATACGCTTGAnCCTGCTCCTGCAATTACCGTTATTTTACTATTATnnAT 

GATAAAGTATCCAACAGAATCCCATTTATCTGGGTTAAAGTAATTGAaATCATCAAAGAA 

TATAAATGTGTGTtCTGGGTCTTGTCTATCTACCGGAGTAGTTGAATTGTAGAGTATGTA 

TATATACCCCTGCCCATTATTGTAGTTGTAAATTTCATTTTTATTTGCTCTAACCCAAAT 

TACTGATACATCGTTATTTCCTTCTCTCCAGGTTTGAACCCAGTAAGGTAAAAGAATAAT 

TTTGTTACTTACTGAATCCCAGCCAATTACTCTCAGCTCTGTTGGAGATTGAGGGTTATG 

CATTTCACTATAGTTAAAGTTACTACTATTTAATATTATACAGAAAGTACGGTTGTAGTT 

ATCATTTGGAAAATTATATATATTTATCTTTTTTTCATAACCCCAGGTGTAGTAAAATTT 

ATTTAAATAGACATATGGGTCAGGAATTCTTGATAATTTTATGTCTCTGTTGATGACTAT 

TGGCTTTAATGCGATTAACTCACCATTATTTAATTTTTTTGAATATTTTATGTCAATTTC 

ACAATATAAATGTACTACTAATGGGTCGTATGTAGGTGAAATTTTAACAGAACTAATGTT 

ATAGGATATATTTGAGTAGCCATAATTCACATTATTTAGTGATTCTTTCGTTTCATTTTT 

TATATAGCTGGTTATATACGCAACTGCCTCACTTGAAGCTGTAAAAAATTTTcGTTCTTT 

CATTATTTTATAGCTTGCATTTACAAAGGCATCTTCTACAATTTTATCTATATTTCTATC 

TATAGTATTTATTAAATTTTTTTCATATAAACTTACTTCTTTTATTTTTATTTCATCCTC 

TACTTCCTTTGTTTTGTAATCAATTGTTGCATAAAACACTGCAGATATCACAAACATTAG 

CATAACTAAAATTATCGCATTTTGAGAGAAATACATGGCAATCCCCTTAATTTCCTAATA 

TATATAATTCAACCCTTGAAGAGGATACATTTTTTGAAAGATACACAGGCATATGTATAT 

CGTAGTTTCTATATTTTAGATAATTATAGGCGTCATCATAGTCAAGAAACCTCTCCTTGG 

ATATATTTACAAAGTCCTCATTTCCATAAATTACATACCACCCTTCACTTCTGTTTAAAG 

TTAAAACTGTTAAAATATATACACTATTACTATTATTAACCCCATTGGATTTATTTATTA 

GAAGGTTATTATCTATATAAAGAAGATAATGTTTAAGTGGAATTCTTTCCTCTAAAAGTT 

TTTTTGAATCATTAACTCTATCAAAATAATATAAAAGAACAGCATCTTGCAAAGTTCCAT 

CCTCTGATAGATGTTCCATAGTGCTTATTCCTTTATCAAAAATATAATCAGATTTTACAA 

TATCCACATAATTGTTGTTATGTTCGACAATAGATACTGTCCAATATGCCATCCCTATG^ 

GACATTGTTAGGTGATAGTGTTAAATAGTAAAAATCAGTGTTTTCATTTGTTTTAATTAA 

TGGAGTTATATCTATATTGTCAAATACTGATTTTAATGGTTTTTTGTATATATATCCATT 

TACTGTAAAAATGACATCAGAAGATAGAAGCGTTACTGTTGGTTCTGGTTCTACATAC^ 

TCTAACAAATACCCAATCAATGCTAATATTTCCGTTTTGTTCGTCTTGAGGAACAGGATA 

AGTATATATGTTTGAATATATGGTTTTATATATGGCATCATCTATTATGAAATTCACT 

ACTACCTCCATCTCTTTGAATCTCATAAGTGTGCCAATTATCGTATAAAT^^ 

TAAGATAATGTAACTATCGTAATCTTGATTTAATACAGAAGATTCAGCTCTTAGCCACTC 

^^:™^^^G^CCC^CTCCTCATATTTCTTATGGAAATTTGCATGGAATCTTACAGMCT 
ATTTGGATAGTATGTACTAATGGAGGAAATATGCGTATTATTAATATCATAAAT^ 

ATTACCTTTAGAAAAATCATCAAAGAATAGGGGGAAGGTATTATCTCCATTTGCACT 
TGTAGCTGTTGGATTTCCATAAAGCATATATATTAGCTTATGTTCATTTGGAGCTAAATT
TACCTTAACCCAGGCGACAGTATGCGGAGTATCTATTGTATTTGGCTCTATCCAATAACT 
TAATGGATTACCATCTTCATCAACAAACCTTACATCTCCACAATCTGTTCTCATCTCTCC 
AGAATTTATATAACTTTGAGAATCAAAAACAATTTTTACATCATAATCATTTAAATTTTG 
ATTTAGGTTGTTTATTATTAATATTGGAGTAGCATACCTCCAATTCTGCCAAGTAATAAA
TGGATTACCATTATTTGTAACAACCCTCGCAGATGCTGAAATAATTTCGCTATTGACTTT 
AAAGTTIAATGTGGTCTCCATTACAGCCATATAGATTTATAGAATCTCCATATTTCATTCC 
AGTAATTTTTGGAATATATACATTTTCTTTAAAGTAAATTAAATCCATCCCACTTATACT 
TAGGTTGGCGTTGTTAAATGTAGTATTTACATTTGAATAAATTTTAAATGTGTTGCTACT 

GAAATTATATGTTATAGCCTGAAGTCCATCATTTGCACAAATTTCGTCTGAAATGTTGTC
AAGTTCTTCATCATATATGTTTGGATAAATTATAAAACGAATCAATCCCTTATATTCACT 
AAAATTATTAAATAAGTCGAAAGTTTTTTCTTTAATATCTAATTCATTTGTAAAATTATC 
CAAATATGTTTTGTTATAAAACTTTCCAGGAAATTTATTATTCCTTAAAAAATAATCTTT 
TAGTAACAAAGCTTTATGAAATTTTTCAGTATCCTTCTTTTCCTCTAATGCTGTGAGCAT 

ATTATGACTATAAACCATATACCCTATATAAAAAACACTCAAGAAAATGAAGGCAATTAC
TATTGCCTCATAAGTAAATATATACCCTCTTTTTGAAACAATTTTTCTAuAACATATGCCA 
TCCCTCATTTAAACATAATTATTTTTTTCATTTTTAATAAATATATCCAAAACCAGTTCA 
GGAACAAAGCTATTGGGAAGATAATAAAACTTATGACATCATGAACTACATTATAATTTA 
TCATATTTGAATAATTTATTATTAATATTATCCTCAAAATATTTGAGATAGTAATAATTG 

AAAGTCCAAATACTGAATATGAGATTTTATATTTAATAGGAACATCAGGAGTCCCAAAGA
TATAACCTAAAAATAAAGCCATTTCTAATGAACATGTGCATGGTGAGCTAATCTCTAT/U^ 
TATTTTTGCCAACTATAATTTCATTCTTGTAAAATTTCAAATTTAAAAGTTTAGATAGGG 
TTATTGTTAATAAGTCCATTATGTTTCCTTCTAACATTTTTAAAATGTAATAAAATATAA 
AAAAATATATTAAAAATCTAAGTATGTATATAGCATTTTTATTGCCCATTTAAGGCCCTC 

TTCACTTTGTATATATTGTCCAATGCTTTATTCTTTGCAGCTAAGGTTATATTTTTTGAA
ATACTTTCTGCTGTAGAGGTCCCTCTCGTAACTGATTTTAGATAATAGAATCCAACTATA 
GATGCTGCAACTACAAGAGCACCTAAC7VATAATGCCAATTCTAATGATATTTGAGCTTTA 
TTAGATATTATTTTTTTAGGTTTCATTTTAATCCCCTTATAATTTGG/U^GAAACAGAATA 
TTGTTGAAATAACTATCAATATCTCAATATATGGCGGTATTGGAACGCTATGTATAAATG 

ATTCTCCTATATTTATTAACTTAGTTCTTGAATCTGAAAGATTATCATTATCTATAACTG
TCAAGGTAACTGGATAAACCCCCTCTTTTTTATATTTGTGTATTATAATTGGATTTGTTG 
TTGTATTTGCTGGTGTTCCATCTCCAAAGTCCCAGATATAATATTTAATATATCCATCTT 
CATCGTATGATAAATTAGCGTTAAACTCTACAGTAGTTCCATTTATTACTTTATACGTAA 
AGTCAGCAACTGGAGGATATTTTGGAGGTGGAGAAATTATAACAATCTTTGTTACACTAT 

CCGTTAAGTTTGTATCGCTTTTAACAGTTAAGGTTACAAAGTATGCCCCCTCCTTGCTGT
AAGTATGGATAGGATTTTgTTCTGTTGATGTGCTGCCATCTCCAAAGTCCCAGTGCCAAC 
TAATTATTTTCCCAGGGGCCACAACTGATGTATCTTCAAATCTTACAGTATTTTCATTTA 
TTATTTCATATGTAAAGTTAGCTAATATACCCCCAACTACTATTTGTTTTGATATTGAAC 
TACTTGCGTTATATTTGTCAAATACTGTTAAGGTAACTGTATAGTAGCCTGGTCTTTCAT 

ATTTGTGATG/^iACTATCGTATCTGTTGTATTGATAACGGTCCCATCTCCAAAATTCCAAA
TGTAATATGCAATTTCACCCTCCGGGTCATAAGACTGGGAAACGAATTCTACATCCTCAT 
TAGGTTCAGGTTTATCTGGATAGTATATAAATTGAGCCACAGGAGGTCTATTTATCACAC 
TAAACTTAACAGTTGTTGAATTAACTCCTCCCATTCCATCCCAAACTACCAATTTAGCAG 
TGTAATTCCCTATAGGAAAACTTTTGGATATAATAGTTAATTCATTTGATGAGTAATTCC 

ATGCAACATTTCCATTAGAATCATAAACTGTTAAGTTAAATCCATATATTCTTGCCATGG
GTGAGTTTGGAGATATAGGATAATACCCTATTAAGGTGCCGTAGTAATTATACTCTGGAA 
TCATTCTATTAGCATCTGGGTCATAACTATTTATTGGACTTVAAGGAAATTGTATCTTTAT 
AACTTGCAGGATTTGGATAAATATAGAGTTTGGCTATTGGGTTTTTATTGTCTATTACAT 
ATACTGTTTTCATATTTTCTGCAGTATAAACTTTCATATATATTGGATAAACCCCTTCTG 

AAGTGTATGTATGTGTGAAAGTATATGGCGATTTTTTGGGTTTTATCCAAACACTGCCAC
CATCTCCGAAATATATATGATGCCAATACCAAGTCCACGATGCCGGCTCTACTATTGTTA 
TATTTATTGGATAGTAAGTAGGGGCTATTGTAGGAGAAGCATAAATTTGAGGATAACTGC 
TGTATCCTCCAACTCCAATTGGTGGTGGAATTCCAACCTCTATATTTCCATTATCATCTA 
TAACGAATACATGAGGATAGTATAGTCCACTTGAAGAATATCTATGAGTAGGGCTTTTTT 

CAAATGAACATGTCCCATCTCCAAAACACCACATTATAAATATTGGATTTCCATAAGGCG 
AACAATCAAATCTAACGTTTTCATTTACACTAACTTGAGTTTTGTCAGCAGTTGCTGTTA 
TATCTATATAATATCCATCTCTCAATTTTACATTAAATTTTGGAGTCGAAATTACATCTG 
AATAGTAGTATAAATTAACCGTATGGTTTTCTTTATTATATTCATAATCATAGTAAGTTT 
TATCATGAGCAGATGAAGGGTAAAAATTATATCTTGTGTTTTCTACATCGTCAACTACTA 
TAAAGTTGAGGGTATCTGATTTCCACCAACTACCCCACCCATAACTCATCCAGAAGAACG
GCCACATAAATGGAAACTTATATTGATGATATGATGGAGTAAAGTAAGACTTATATGTAT 
AAGGAGTTTCTGTTCCATCTCCAAAATCCCACTTCCAATATTCTCCCCAAGCTCCACTCA 
TTTCAAATTTTATAGTGTCATTAACTTTATAGGTTATTTTATATGGGTCAGTGTATGCAT 
TTCCATTATTATTATTGTCATCTCCAGAGCTATTATATACATAAGTGTATGCCTCTCCAT 
CATAGTGACTTGGTCCTGTAACCCAGTAAATATACCCCCCTCTTGCTCTTTTTACTTCAA

TTCCTTCATCCAAATATCCAACCATTACCCTTCCAGAATCATCAACAACTAAAACTCTTG 

GATAATAAAGCCCTGACTTTGTATATGTATGTTCTGGAAATTTTTCAAAGGAAAAAGTTC 

CATCTCCAAAACTCCATACACAGAATATTATATTTCTACTAACTGAAAAATTAAATTTAA 

CAGTATCTCCTTCTACAATTTCATCTCTGCTAACATTTACAGTTACTGAAGTTGTATCAA 

CACTTAAACCATTAAACTCCCTATCAACAGGGGTTTCTGAGTAATATTTTATTATAACAG 

TATTATTTGTGCTATTATAATAGACCTCCCAACTTGTCTTTGAATTCAACGGACTGCCAT 

TAAATACATACTTAGTATTTGCTACATCTCCAACAACAAGCCAGTTGTAAGTTAAAGCTT 

TCGAATAGCCAGTATTATTTAGATAACCGCACCATGCTACTGGATAAGGAAATGGAAATG 

TATATGTATGGGTGGTTGTTCTATAATTCCCATAATCAGTCTCAGTTAAGTCACCAAAAT 

CCCATTTAACAACTCCATTATCTAATATATCCTGAGCTACAGAATCGGGGGCTAAAGCTT 

CAAATGTTATCGTGTCATTTACATTGTATGCAATGATATTGGGGTCTGTGTTATAAACTC 

CAGAACTGTTAGTTATGTTCATAGTGTTAGGATGTATCACAATAACATATCCATTTACTA 

TTGAAATTATACCAAGAATGATTAATGGCATTAAAATCTTTAAAAGTTTCAT7VATAACCC 

ACCAAAATTTTAAGTTATATTTATTGTCAATTCTTTACATATTGTTATATTATTTTTATC 

AATAGTCACAGTTATGCTTATATTTTTTCCAATATCAACTGGGGCAGTTTCTATATTGCT 

TCCAGAAATTATGACACCGTTATCTGTTGGGGTAAATACGATAAGGGTTTTATAACTCAC 

ATTAATAATTTTATTCGAGACATGTATTACATACCCCAAATCTCCAATAGGTTTTAATTT 

CAAAACTATTGTTTCATTTTTTGTATATGAAAGGATTGCATAGTTCTCAAATGTATCGGC 

TATACTGTACATCCTATCCACTATCAAAGCATCCGTAGTGTTATTTGTAAATGTAAGTGC 

ATTGTAATAAATAAACAGTGAAACCAACATTAAAAATAATATTGCAAGTACAAAATCAAC 

AGATAACTGTCCCCTTTTTTTTATTTTGTTTTTGTTCATATTAAACCTCTTTTTTTCTAA 

TTTTACATTATCTTTTATGCTGAAATTAATACTATAATCATATAATAATATAGTATTATT 

CTTTCTTAATTTATATTTTTCCACTAAAATTGGAGACTGTCAAGTTAAGTTTTTATCAAA 

ATATTGATAAAAAATAATAAAATATGAGGCTCACGATAGAAGTTATAAAGGAGAGAATCG 

TAGAGAGGAAGCTTTTTAAAAGGAATAGGAAATCGATAGAGGTT7VAAATCTTAGCAGGGC 

TTTTGTATTACCTCGGATTATCGTTAAGGAAGGTAAGTTTATTCCTTTCCCAATTCGAAG 

ACATAAGCCACGAATCGGTTAGAATTTATTATCACAAGATTAAAGAAGTTTTAAACGAGC 

CAGAAAGAAAGGAAAGAAACTTAATTGCAATCGATGAGACTT^CTAAAGGTTGGAGACA 

AATATATTTATGCATGGTCTGCCATCGATGTAGAAACGAAAGAATGCTTAGGAGTTTATA 

TATCGAAGACAAGAAATTACCTCGATACTATATTATTCGTTAAGAGTATATTAAAATTTT 

GCTCGAATAAGCCAAAGATTTTAGTTGACGGTGGAAAGTGGTATCCGTGGGCGTTGCGAA 

AATTAGGCTTAGAATTCGAAAGAGTCAAATTCGGACTAAGAAATTGCGTAGAAAGCTTCT 

TCTCAGTGCTCAAACGAAGAACTAAAGTATTCTACAATAGATTTCCAAATAATAGTAAAT 

TCGATACGGTTATTAGCTGGATAAAAAGCTTCATGATGTTCTACAACTGGATGAAATCGT 

TAACTTGACAACCTCGATGGGAACTAATAAGGTTTTAAGATAACATCTCGTGTTTACTCT 

ATTTATAGATTCTAAATTTTTMTGCTAAATATTAGGTATTGCTATAAATATTTAATGCA 

TAAAGATTTAATAATACATGGTTACATAGTGGCATGTTTAATAATATGTAGCATTTTTCA 

AAAACTTAATAAAATTTTAAAGAATTAATATAAGCCTAAAAGTGCCTAATAGGACTTTCG 

CAAGAATACAATTCTAATTGAATGATAACACCGTTAGATATCAAGTAACCTTAACAAATC 

TATAAACTGCAAAAGTCCTATTCAATGTTATGAGGTGGCATAATGTTACAAAGATGTATT 

AAATGTGGAAAAACTTACGATGTGGATGAGATAATCTACACCTGCGAATGTGGTGGCTTA 

TTGGAGATTATTTATGATTATGAAGAGATTAAAGATAAAGTTTCAGAAGAAAAACTAAGA 

AAGAGAGAAATTGGAGTCTGGAGATATTTGGAATACTTACCAGTAAAAGACGAAAGTAAA 

ATTGTAAGTCTATGTGAAGGAGGAACTCCATTATATAGATGTAACAACTTGGAAAAAGAG 

CTTGGAATTAAAGAACTCTATGTAAAAAATGAAGGGGCTAATCCAACTGGAAGCTTTAAA 

GATAGGGGGATGACTGTTGGAGTAACT^GGGCAAATGAGTTGGGTGTTGAGGTTGTTGGC 

TGTGCTTCAACAGGAAATACATCCGCTTCTTTAGCCGCTTACTCAGCAAGAAGTGGAAAG 

AAATGTATTGTTCTATTACCAGAAGGAAAAGTTGCCTTAGGAAAGTTAGCTCAAGCAATG 

TTCTATGGAGCTAAGGTTATTCAAGTCAAAGGGAACTTTGATGATGCATTAGATATGGTT 

AAACAATTAGCAAAAGAGAAGTTGATTTATTTATTAAATTCAATAAATCCATTTAGATTA 

GAGGGACAGAAAACCATAGCATTTGAAATATGTGACCAATTAAACTGGCAAGTCCCAGAT 

AGAGTTATTGTTCCAGTTGGAAATGCTGGAAACATCTCAGCTATATGGAAAGGATTTAAA 

GAATTTGAAATTACTGGCATTATAGATGAACTCCCAAAAATGACCGGAATTCAGGCAGAT 

GGAGCTAAGCCAATTGTTGAAGCATTTAGAAAGAGAGCTAAAGACATCATCCCATATAAA 

AATCCAGAGACAATTGCAACAGCTATAAGGATTGGAAATCCAGTAAATGCCCCAAAGGCT 

TTAGATGCCATATACTCCTCTGGAGGTTATGCTGAAGCAGTTACTGATGAAGAGATTGTT 

GAAGCTCAAAAGCTATTGGCAAGAAAAGAGGGAATTTTTGTTGAACCAGCTTCAGCTTCA 

TCAATAGCTGGGCTTAAT^GTTATTAGAAGAAGGAATTATTGATAGAGATGAAAGAATT 

GTTTGTATAACAACAGGGCATGGGTTGAAAGACCCAGATGCAGCTATAAGGGCAAGTGAA 

GAGCCGATAAAGATTGAATGTGATATGAATGTTTTAAAAAGAATTTTGAAAGAGTTATAA 

ACAATAATATTTTATTATTATATTTTTTATGTCTCTAAAATAACTTCAAAATAACTCCAT 

AGAAATCATAAATCTATATATAAATCTATATATACGGTCTTTAGAAAAGTTATTAAAATC 

AATATGGAATATTT7VAACGTCTTCCAAAAGGAGGGTTCGAAACAGTTTTTAATTTTCTAT 
AACTTACAGTAGCATATCATAATAAACAATATCACAATATAAATATTGTTTTTTTATTAA

AATAGTAATATGTATTGTTATATCATAATGTTAATGAGGAGGCTTTGCCTTCGAGACGAA 

ATGTTGATACTAAATATTAACGT^GTTTGGATTTTGGGGCTGTATCTGTTCAGTCCTAAG 

TCTGATGAACTTATAGTGAAGGGAATGGTGTTCCCGATGAAGCTATGGGCTGAGGACAAC 

CCATTTCCATAGCTTACCGATTCGTATAGTAAGTTATTAAATGCTATGGTAAGCTATGGA 

AACGGGAAACGGTTAAATAGATCTTGGATTATATTAAACATTATCTAATTATTGAGATTT 

CTTCTTAATCTTTTAAAGGTTTTAATCATGTATTAAGAAAATTTGGATAAAAATAGAAAG 

CTATATATAGGAGTTTAGGTATAA7VATAAGAGCAAAAAGTAAGGGTTTAAATCGATAGTC 

CATTAAAACAAGGATAAACTCTAAAAAAGCAAGATTATTCTTTAACTCTTTTACCAACAG 

CTACGTATATGTTGTTAGCTCCAATTTTATCTCCAAATTTGGATAAAACCTCTATATTTC 

TCTCTCTACACAGATTTTCTACCTTCTTCCTACAAGCATCTTTTGTTTTCTCCCCAAAAG 

ATACAACAACTATATTTCCATCTATTTCCTCAGAACCTAAGACAGGTTTTTTTATCTTTT 

TGTATAACCCATCAAAGTCCTTTAAATGTGTCACAATCATTTTAGCATCTTCAATAAATG 

GAAAGTCCTTTAAATACTTGGTTATAAATGATATATCCCTTTCTCTATCTTGCCCATACT 

TCTTTAAATCTACTTTGTAGTAGTTGAGTTTTCCATCCTGATATTCTTTTATAATTGTCT 

TAGCTGTTCTAACTAAATCAACTTCTCCACCTTTGGTTAAATAACTCCTTrTATTTCCAA 

TCTTTTTTAATAACTCTTCATCAACCTCTTCATAATCAACTCCAAAGTATTCTTTTATTA 

TTGAGTTATCAAAGTTATTTATCCTACTTAAAATCTTTAAAGCTGGAGGAATAGGGTTTT 

CTACTTTTTCCAATCTCAAAGCTCCACTTATAACCAAATCATCCTCATCTCTCATCTCCA 

AAACTCCAGGAGTGTCCATAAGCTTAATATTTTTAGTTAATCTAACCCACTGCTCTCCTT 

TGGTTAiQJ^LCCAGCTACACTTCCAGTTAAAGCTTTTCTTTTTCCAGTTAATGCGTTAATAA 

TGGATGATTTTCCAACGTTTGGATAACCAACAATTCCAACTTTTCCTTCTTTTTTACCCA 

TTTCTTTTAAGGATTGTTTTATCATCTCTCTCAAAATTTTTGTTCCCAATCTTCTCTTAG 

CAGATACAAATACTGTATTTTCCCCAAAAACTTCTTTCCATTTTTCTAAAATATCTTTTG 

GAACTAAATCAGCCTTATTTAATACATAGATTAGCTTTTTACCTTTTGCTTTGATTTTTT 

TCTCCAACTCTCTGTTTCTTGTCATCTCTGGGTCTCTTGCATCTAATACCAATAAGATGA 

CATCACATTCATCAATAATTTTATTAACTATTTTTTTAACTGGTACTTTCTTGTATCTCA 

TAACTCTCACCATCAAAAAAATGTTATATTCCTCTCATTTATATTTTTTATCAATGAATA 

TGACAAAATAAATTTATAAATTTATCGATTATAGT^ATTTTTTATAGAACTTCAAACAC 

ATTTACAAATAGTTAAATTTTCAATAAAAAATATGAATAAAAAGGTGATATTGTGGTTGT 

AGATGCAAAAGAAGTAGAGATGATAAATACCTTAGTTTTTGAGACATTAGGAAATCCAGA 

GAAGGAGAGAGAATTTAAGTTAA7VATCATTGAAGAGATGGGGATTTGACTTAATATTTGG 

TAAAGTAGATGGAAAAGAAACATATTTCACTGTTGAATTAGATGAAAGAAAAGCTGGAGA 

TAAGTTTTCAAAGGATGGAAAGGAGTATGAAGTTATCGAAGTTCTTCAAGAATTGCCAAA 

AAACACTGAGCTCTATGCACACATAGAAATGGAGATGGGTAAAGCATATATTGTCTGTCA 

ATTAAGAGATGAAGATGGAAAAAACACAGAAGTTTTAAGAGTTCCAGCAGCTACTTTATT 

GTTAGCTTTCCTTAAAAAGAATAAATTAGCAAACATAATAAAAGCAATAAAGAACGTTGG 

AATTAGTTTAGAACTTTCCATGCAGAATGGTGTTGGAGGAAAGCCATTATCTTATGAAGA 

ATTGCCAAACGTTGCAAGAAGGTTTATAAGAAGTGCAAGAAAGGTTGAGAAAGAAACTGG 

TTTTGGAAGGTTGTCATTTGCATACTATGGAGAAACAAAAGATGGAGAACCAAGATATAG 

ATTTAGCTGGCTGTTGCCAACAATTGCCTTATTTGACTTAGATATAGCTAAAAAAGTAGA 

ACAAACCTTGGGAATCTTAAAGGTTTCTGAATAAATAAAATTTTTTGAGGTGAGATGATG 

ATTTATGGGATTTTGTTAAATATTCCAGAAAAACATGCTACAAAGTATGAGGATTTAATT 

AGGAGAATAATTGGAGAAGGAATAGCAAGAGGAGATATCTTATCATTTACAGAGGCAAGA 

TACAAAGGAGATGTCGCTTTTGTCATGCTTGCAAGGTCAAGGAGAGCGGCTGAGAAAGTT 

TATCAGCAACTTAAAGAGCATCCAATCCATGTAAAGGTTATAGAGATTGAAGGAAAAGGA 

GATTAATAGTTCATAATTTGTGAAAAAAAATTCTTAATATTTTTATACCATAATTTATAT 

TTTTTATATGTGAAGTATTTCATTATCGTGTAAGAGGGGAGAATATGGAGCAATTTGATT 

TTGATAGCATCTTCAATAATGCAGTAGGTAATATGAAATATTTCATTAAAAAAGTTAAAA 

AATACGAAGAGATTAAAAAGCATGAAGATATATTAAAAAAAGATTTATTAAACGCTGTAA 

ATGTGTTTATAGAGAGGTTTAGAAATAATCCATGCATCTGCAAAAATAGGAATAATCACA 

GTAGTTGCACCACAAACGCATGTGGGGAGATAGAAAATCGCATGAAAAACTGGGTTGAGA 

AGTTATTTGAATATAGTGATGATGAAGAAAAATTAAATGAATTTTTTAAAATTATAGCAA 

AAGATGCAATGAAATTTGTTGAGTTGGATTTTGAACCGTTGTATATTTTATGTGGATTGG 

AGGAAATAAGAGAGACGGCAGAAGAAAAATTAAAAGAGGAACTACCAACTGAAGAGTATT 

TAAAAGTTATGGAAGAGTTTGATGATTTAATTGAAAGAATGTCTTTGGTTGCCACAGCTG 

TTTATATGGAGTTCGAAGATAGGGTTTTTGAAAGAATGGGCATAAACAAAAACTTAAAAT 

ATAATATTATCAAGTTGGGATTGAAAAAGATGAATATTAATTAATAAAAATTAATAAATA 

ATACCTATTTTTTAATATTTATTATTACAAAGTTTTATATATTTTGTTTTACATAGATGT 

TATTGATTAGGTCATAACACTAAATAATTAAAAAATATATTAAAAAAGAAAGGTGGCTTT 

TATGGAAAAAATCTTTCCAGACATTTTAGAAGCAATAAGAAATGAAGAGATAATAAAAGA 

AAGTAAAAAAATTCCTATGCCATATTTTGGGTTGTTTGCATTGGTAATATTTGATAAAGT 

TAAAGAACTTGGTTCAGAAACCTCATTATATGAAATTGGTGAAGAATTTGGAAAAATGTT 

ATCTCCTAAAAATATTGAAGAATTGAAAAAAATATTCAAATTAATGAATTTTGGAGATTT

GGAGATTGACGAAAATAAAATACTTCTCAAAAATCCACCATATAAAATAAAGCTATCTAA

TCCTCCATACCAATGGGTATCTAAAGAAGAACCAATTCATGATTTTATAGCTGGAATCTT 

AGCTGGATGTTTAGAAGAGATATTTAAAAAGAAATTTGTTGTTAATGAGGTTGAATGTGT 

TTCTCAAGGAAAAGATAAATGTGTGTTTGAAGTTAAGGAAGTTGATGAGCTAAATAAATA 

AATCAACCAACTGCATATCTTTAACCAAACCTTTATCAGTTATTTTTAGCTCAGGAATCA 

CAGGGAGAGAGAAAAAGCTCATACTTAAAAATGGGTTCTCAAAAGAACTCCAACCTTCTA 

TTTTTTTATACAAAGCATTAATCTTCTCAGCTATGTATTTTCCATCATCTCCCATTATCC 

CTCCAACTGGTAGAGGAAGATATTCAACAACTTCCCCATCCTTAGCAGCTATAAATCCTC 

CACCAATATCTTTTAATTTATTTACAGCTAAGGCTAAATCTTTCTCATTATTTCCTATGG 

CTATTACATTATGAGAATCGTGAGCATAGGAAGAGGCTAAAGCTCCCTCCTCCAAGAAGT 

TGTATATTAAACCCTTTCCAATATTTCCAGTATTTTTATGCCTCTCTATAACGAAGATTT 

TATTTATAGCATTTTCATTCAGTAATATTTTTATTTCTTCAGTGCTAAATATTAGCTCTT 

CAGTTATTAGAGAATCTTTTAATGGTTTTATTACTCTAATAAATCCATCTCTCTCCTTAT 

AATCAATCCCTTTAATTAAAAAATCACCTTCGTTTTTGTATTGGTATTTTAAAGTATTCA 

TGAGCTTTTCGGGAATTTTTCTTTTTTTATTTTTATTTAGTTCATTTAAAACATCATCTA 

AGAATCTTCCTTTTATGACAATGTTATAAACTTTAAAATTGTCTAAATCTTCAAAGATTA 

CAAAACTTGCCTCATTTCCAGCTTTAATTCCTACATCAAACCCAAAATAATTTGCTGGAT 

TTATTGTAACCATTTGAATAGCTTCAATTGGAGAAACATAGTTTGTGGCTTTTCTTAAAA 

TATTTAACATGTAGCCGTCTAAATCTTTAATACAGACGTCATCACTAACCAACATTATAT 

TCCTAAAATCTTTTATCTTTTTGCATATATTAAGCAAATAGATGTTTTTTGATGCTGTTC 

CTTCTCTAATCATTAATTTTAATCCCAATCTAAGCTTTTCTAATGCCTCATCTTCATCAA 

CACTCTCATGGTCGCTCATTATTCCATGAGATATATATTTGTTTAACTCCCAACCTTTTA 

ATTTTGGACAATGCCCATCTATCAATTTATTGTATTTTTTAGCTACTTCTATCTTTTTTA 

ACATCTCTTCATCTTCATTTATTACTGCAGGATAGTTCATAACCTCTCCTAAACCTAAGA 

CATTATCTAAAAGAATGAGTTCTTCAATATTCTCTGCTGTAATCTCAGCTCCACTTGTTT 

CTAAGTTTGTAGCTGGAACACAGGAAGGAAGCATAACATAGACATCTAAAATTTTGGCAT 

CATTCAACATAAACAAAATTCCTTCTTTTCCAGCAATATTTGCTATTTCATGCGGGTCTA 

TAACTACTTTGCTAACTCCGCTTTTTAATACAAATTTCTCAAACTCTGATGGGATGAGAT 

GGGAAGATTCTATATGTATATGCCCATCTATAAATGTTGGAGATAAATATTTTCCTTTTA

AGTCAATAACTTTAACATCCTCCTTTATTTTTTCAATTATCTTATCAATTTCATCATTTA 

AATCCACAAAGGATATTTTATCCCTCTCAACTGCAACATTTCCTTTAACAACCTCTCCAG 

TATATACATCAATAATCTTTGTATTTTTGAAGACAATCATAGAGCTCTCCCTTTAACCTT 

ATTTATGTTAAAGAAACTTTTTAGGAGAAAATTAATAGGAAAAAATTAAATGAAAATCAT 

GGAGTTTCATAACCCAJVAGCTAACGCTTCGGTTTCATCAATUUSlTTATTAAATTATCTTTA 

TAGCACCCTACCTTTAACCTTATTTATATCAAATTTTGCTGTTTCAGCAATAGCCATAAC 

agctaaaacccctgcctgcaatgggtctccaacaaccaaatcagcaaccttaggaacaga 
gccaaacatctttaagcttatcacgggaatgccagtcttttcctttaattctttaactgc 
ttcagttatcttccctcccattaaagagccagctaaaactaaaattcctactcgtggaag 

AGTTGCTACAGCTTTAACAGCCTCATATAAATTTTCTTCACCAACTATTGGAAGAGTATC 

tacgctaattctctcccctcttatattatgcctgtctgcctcacttatcgcccctctcgc 

AACTTCAGCAACTTGTGCCCCTCCACCAATAATAATAACTCTCTTACCATAAATCTTTTT 

taatgagctgtgaatttcaaagctctttacacactcacaactctccattcttctctttag 

CTCCTCAATATCTTTAATCCCTTCAACTTCCATATAAATAAATCCAATTTTACCATCATC 

tttaatgaattgttgagtataggttatattccctcccaattcagaaagaattcccgtaag 

tttgtgcaaaactcctactttattttctgcctctatgctgattccaatttccatgttctc 

acattaaatttatttaatattgatgaaaatcatcaaaaataatattatttaaaatttaaa 

agaagctatcgcctatatcattggtaatgttatctatttcatcagttatgtcctcaatta 

cattatctactcccttgtcaatttcttctattgtgtcttctattgttgtttctattcctt 

cagaattatacccatcagcttcgttattttcgttattatttatcacatcttctattgcat 

cagttatcaactctccagctataactccaccagcaacagctgcagcagttcctaataaat 

tgctactatctctctcaactacaactgttctattggctgttccatttacagtcttattct 

ttctactaaatattaaatatcccagtatcaatccaaaaccaactaaaataaatgctaatc 

caaaaaataaaattaataaggttagtgctgtcataaccatcacataaaaattattttaat 

cttcttctaacagcttcagcatgcccaaacaaaccttcagcttcagctaatgtgataaca 

atatcagcaatattttttaagctttccttatccaatttttgatatgttattttctttaaa 

aatgtctctacattcaaaccagaactcattctcgcaaactgtgaagttggcagaacatga 

ttagttccagaagcataatctccaacaggaactgggctatactctcctaaaaatacactt 

ccagcatgtttaattttatttaaaacttcctctggatttttagttaatatttcaagatgt 

tctggggcatatttatttgagaattcaatacactcttctaaatcaccaattaatatggca 

gagttttctaaggcttttaaaataatctcctttctttcagctttttctatctcttcaaat 

atcttgtttttaatctcctctgccttcttttcagatgttgttgttattacacaagaggcg 

ttagggtcgtgttcagcttgggcaataaaatctaaggcaacaaactctgcattagctgtt 

tcatcagcaataattaaaacctctgaaggacctgctaagaaatctatggcaacttctcca 

taaaccatctttttagctgttgttacatatatattcccaggccctacaataatatcaacc

TTTGGGATAGTCTCTGTTCCATAGGCTAATGCCCCTATAGCTTGAACTCCTCCAACCTTA
TAAATAGCTGAAACTCCAACAATATCTCCTGGTATTAAGGTAGCTGGATTTCCTTTCCCA 
TCTTTTGTAGGTGGGGAGGTTATATATATCTCTTCACATCCAGCAACCTTTGCAGGAATT 
GTTGTCATTAAAACAGTTGAAGGATAAAATGCCCTTCCTCCAGGAACATAGCATCCAACT 
TTTTCTATTGCTCTAACAACCTGTCCTAAAATTATTCCATTATTTTCAACATTTAAATCT
TTTATTTGCTCCATCTGCTTTTTATGGAAGAAATAAATGTTTTCCTTAGCTCTCTCAATA 
GCTTCAACAACTTTATAATCAACTGAGTTATAAGCTTCTTCTATCTCCTCATCTGTAACT
TTAAAATCTTCTATTTCTACACCATCGAACTTTTTTGTATAATATTTTAATGCTTCATCC 
CCTTTTTCTTTAACATCCTTCAAAATCTCCATTACTGTTGGCAATATTTCCTCAAAGTTT 
GCTTTATTCCTATTAATTATTTTCTCCTCTTCTTCCTTTGTTAATTCTTTAATTTTTTTA
ATTATCATTCCAGTCACCATAGATTTTTAATTGACAAAGTTTATATATAGTCAGTGCTTA 
TATTATTTACTGTATAAGAAAAAATCAAGGTGAGAAAATGATACTCTTCGAGTGGGGAAC
TTATAACGCTTTATCAACATTAAAACAGGCAGCATTATTGGGGACAAGAATTACAGAAAT 
TCCACCAGCAGTGTTATCAAGAAGATTGCCATCCGGATACTATGAGAGTTATAAAAAGTT 
AGGTGGGGAGTATTTCACATCAATCTTAGCTCATGGGCCTTATTATAGCTTATCATCAGA
GAAGGGATTGTUU^GGTCATCTTTCAGCCATAGTU^AAAGCTACACTATGTGGAGCTGAGAT 
ATACAACTACCATCTTGGAAAAAGAGTGGGGGATGATTTAAACTACCACTTAGAAGTCTT 
AAAAAAATTCAGTGAAGTTAATAATGAGATGATTTACTCTCCAGAGCCAGCAACAAATAT 
TGGAGAGTTTGGAACATTAGATGAGCTTGAAGAGTTAATAAAAGCGGCTAAAGAGGAAGA
TATAAAAATTATTCCATCATTACAGTTAGAAAACATATTCTTAAATGAATTGGGAGTTTA
TGAGAAGGATGATTTAGATGAAGCAGCTGAAAAGGCAGATGTTGATTGGTGGCTAAAGAT
TTTCAGAAGAATGGATAAAATATCAGATTATATAATGCATTTCAGATTTTCACAGGTTAT 
TGGGCTTAAATATGGAAAGAGATTCTATAAGAAGAGAGTTCCTTTAGGAAAAGGGTATCC 
ACCAGTTGAGCCATTAACTGAAGCTTTAGCTACATACTTAGTAGATAACGCTACAAGAGG 
GGGATTTAAGAAAGTTCTATTTGTCTATACCGGATTGCCAGAGGTTAAGTATAGGGATTT
AATTGACTTGTATGCAATGATTATGAAGAAATCCATCGACAAGTTGATGAGTAGAGAGAG 
CCAGGTTGAATATGGCGATTTCTATAAAGTTATGAGTTCAGAAGAGGAAGAATAAATTTT 
CTATTTTTTAGCTTAATTTTATATTGCATTAAATTTAAAATATTTTGCTTTTTAATTTTT 
AATTAAATAAAACTTTTAAGGGGAGAGAATATGATATGTTTGCCAGTAGTTGAAGATAGT 
GTAGAAAAAGCAATAAAAACAGCTGAAAAGTATTTAGAMTAGCAGATATTGTTGAATTT
AGGATAGATATGCTTAAAGAAGTTAGTGAAGAAGATATAGAGAAATTTGCTAAGTATCCT 
TGCATAATAACTGTTAGAGCAGATTGGGAGGGTGGTTATTGGAAGGGAAATAATGAGGAA 
AGATTAAACTTAATAAAAAAGGCAATTGAATGCAATGCCAAATTTGTTGATATTGAATTG 
AGAGAGGAGAAAAATAAAGAACTTGTAAAATTTAGAGATGAAATTGGTTCTU^AAACAAAA 
ATTATAATTTCTTATCATGATTTTGAAAAAACTCCTTCTAAGGAAAAATTGGTAGAGATT
GTTGAAAAAGCTCTTAGCATTGGAGATATAGCAAAATTTGCAACAATGGCAAATAGTAAA 
GAAGATGTCCTCAATATCTTAGAAGTGATAAATAAATATCCTGGAAAGATTATTGGTATT 
GGAATGGGCGAGAAAGGGAAACTAACAAGAATCTTAGGGGTTTATTTTGGCTCAATATTA 
ACGTTTGCTTCATATAAAGGGAAAAGTTCTGCCCCTGGGCAGGTTGATATTGATACATTA 
AAAGAAATCTGGAGACTAATGGATTTAAAGTAAATTTAAATTTCTTAGCATAATTTCAGC
TAATTGTTTATGTTCTCTACCTCCAACTTTTTTAATTATTGAGAAATATTTTCTAATGTC 
ATTTATCATCTCTTTTTTGTCATCTTCATTAATTTTTATATTTTTGTTGGCTAATCTTGA 
GTAGATAACTGCTATCTCCACCAATAAGCCGTCTGCCCTATTGTATGGCTTTGGGATGTT 
GTTTAAATATATCCTCTTTATTTCCTCTCCCTCTAAAATCATCAATCTGTTTTTTCCAAA
TTTATCTTCCCTATCAACAAACTTTCTTTCGATAACTTTGTAAAATATAGCATAATAAGA
GCTTTTTAGATAAGGAATGTCATTATAATAAAGATAATCATCCTCATCATCTAAAACTGC 
CTTTGCTATCTCAATTGGTGGAACCACATTAACTGAAAAATAATCTTCAGTTAAAAGATT
TTCATAAGTATGAGAGCCAGAAAAAAGATGCATAATAACTTTTTTATCTTTAAAATAAAC 
ACCAATTGGGGCTTTATTGTCTCTATTATTTTTTCTTGTTGTTACAACCACTTCACTTAT
CATGGTTATCCCAAATTTACTTTTATGCTCTCTAAATCTCCTCCTCTTCTGTATGATAAA
TTTATCTTAACTCTTCCCTTATCTACCTTTTTATTTAATGGAATTACCAATGGAGGATTC 
AGCATTGATGTTTGTCCAGCTACATGTTTATCATCCAATATTGTATAGGTTCTCAGCTTT 
ATTCCTAAGTTTTCACAGCTTTTTTCAAGCTCTAACTCTATATTATAGCTTACTTCAATA 
GGATTTGTCTTATGAAAATCAACTTCCTCATAAATAACCTCTTCAGAAACTTCCTCTGAT 
TTAATATCTTCATCATAATAGATATGGCTCATTTTTGCCTCTACAAGTTGTATAGTTGAT
ATTGCCTTAGCTGGGATTATTTTAACATCTTCTTTTAAAAAACCTCTTTCTATTATTGAA 
TTCATAACTTTAACTTGTGGTTCAATAATTAAAGCAGTGTCTAAAAGCTCAGCTATAACC 
ACATCAGCCTTCTCTTTAAAGTTGTAAGTTGAGGCATCTCCTTCAATAATCTCAATGTTA 
TTAAATCCATTAACTTTTATATTTTCTTTAGCATAATCATAAGTAAAAGGGTCTAACTCA
ATGGCATAAACTTTTTTTGCTTTCTTTGCAGCAATCATTGCTAAAATTCCACTACCTGTT
CCCAAATCAAAGACAACGTCATCTTCATCTACAACTCTCTCTATGGCGTTTTTAAAGATA 
GCCAATCTCTCATAGTCAGTTAAT/^GAGTAATGCCATTGTGGAACCTTTAGTCTTAAT 
TTCATGTTATCCCGTCATTCATATATTTATAAACGCAAATAAATTTATAGATTGAAAATT 
GCGTAACCTATATGCATCAATTTTTCATATTTTTCAAAAATGAAAAAATATATATAGGGG

ATAAGTTAATAATATTATCCTGTGGGGGAATAATACGAAATGTTTTGCTATTTTATCATA

AACTTTGAGATATGGCTTAATTAGATAATGTTAAACATAAGGGGAGGGGTTTTACGCCTA 

AAACCATATTTATATAACATTTTTACAGACATAATTTAAAAATATAATTTTTGGTATTTA 

ATCTCTTATCATACCCCTTTCTTTTTGCCATTTTCTCCTTAAACCTAATATACACCCTCC 

TCCCCTAACTCCTCCAATAGGCACTTGAGGAGTTCCCAAACAGTTCATACATCTTGCCAT 

AACTGAAGCACCTAAAGCTAAGCCATCCTCAACAAACACAACATTTTCCTCAACTTTATC 

CCAAATTTCTAAAGTTTTAAGCTTCTCAATAATTAATTCTGGCTTTCTGCCAGTAATCCC 

AGCCCTTCCTGTAATTCCTACAGCTGACTTTTCACTAATTAATCCTTTTTTATAAGCTAA 

CTCTACCAACCTTCTAACAACTTCACTCATAACATAGTCTAAACAGCACATCAACGTTGG 

AATATCACTCTTTTCAACTAATTCCCTCCCCAATTCTTCCAATTTTATCAAATCACTACC 

ATTCTTACCAACATCACATCCAATTAATGTAGTTCCAGCCTTTTCAGCGGACTTTGGGTC 

TACTGGGACAGTTCCAAATCTATCAACATCTTTTGGGACTTCTTTAATAATTATATATTT 

GTGCATTTCTTCAGCATATTCCTTAGCTAATTCTTCATTTGGCTTTCCTTTTATATTTGC 

TAAATCTAAAGCCGCTCCAGTCTTCTCATCTATTTTTCCAGAACCCCTTGCAATTGCATC 

AGCTATAGCTCCAGCTAAACCGCATAAATTACCAATAACCTTTGCATAAGGTAAAGTGTC 

ATTAGTTATTCTACCAGCCAAGGTTGTTCCAAAGTCAATACTCATACAAGGATTTCTGAA 

ATCTACATCTGTCCATTTACTTCCAACTTTTATTCCTGCAGTTACAAGCTCTCCTTCCAT 

CTCGTTAGCTACAACCTCCTTTCCTGTAGGAGGCAGAACTCCAGTAACCGCTCCATCAAA 

TATAATCTTATCTAAAAAAGAATATTTATCAAACGGCTTTGGTATCTGTTCCTTAGTCAT 

TGCTGGAGTCATCTTTGCTGGAGGAACTCCAGCTTTCATACATCCTTGAGCTAAGGCAAT 

AATCATCTCTCCAACTTCTTCTGGAGATGCAAAACCTGCAGTAACTCCAGTACTTCTAAC 

AACAAAGTGTAAGTCATCAACAGTTAGTCCAGCTTTTTTTAAACTCTCCAACAAAACCTC 

TTTAACCATATCTGCAACTGCCTCTCTTGTTAATTCAACCCCCCATAGTGTCTCTCCAAA 

AACTTCTTCTCCTTTCTTTGGCTTTCTGACATCCCTTGTCATCTTCACGTGCTTGCTAAC 

AATGTAGGTTTTACCAGTATCCATATTTGTTGCTGTTATGATGGATTTTGTTGTTGTATT 

TCCTAACTCAACTGATGCCACTATATAGTAAGGATTTCTTTTTAACTCAATCAAATCTAC 

ACTTTGTGACTTTGCATAGGCAATTTTTGGCTTCTTTTTAAACAGTCCTGAGATGACATC 

AAAGATTCCCATGCTACCCCTCTCTACAAAAATATTGCAATAAAATATTTATCTCTGGCT 

TATGGTTTATAAAATCTCCCTTACAAATTTTTTAGATAGTGCAATAATTGAAACTATTGG 

CGGAACTCCCAAGGCTTCTTTAAATAAAGAGGCATCGCAAACATACAACCCCTCTCTAAC 

CTCAAACTCATCAACAACTAAGCTTAAACTCCCCCCTGGATGAGAACCCCTTGGTATAGT 

TGTGTATATATCATCAACACCCAACTTGTATAAATATTTTGTTGCCTTACATATACCTCT 

TGCAAGAGTTTTGAAATCTTCCTTAGTTATCTCTTTTTTAACGTCGTTATCTAAAACCAC 

TCCATTGTTTTCATCCTTAATCTTTATCATAATCCCCACAATATCTTTCTCTTTCACATC 

CTTATAATCTTTTTTTATTTCGTTAATTAGTAGTTTTGAATAATGAGTTGCCAGCATGAA 

ATTTTTGTATTTCTTATAAACAAGCATGGAGATGTCTTTATTTAGATAGCTATCTTCTAA 

AATCCCACCAACAGTAACAAAGGTATCTATAAATAAGTTTTTTCCAATATTCTCATCGTC 

AATCATTTTTTTTAGAATTCTTGGAGAATTAATGCCTCCAGCAGAGATTATGAGATTTTT 

AGCTTTAATCTTTCTACCTTTATCATCTAAGATTTCGTAATAATTGCTATAATTTATTGC 

TTTTATGTTAAATTCAGTGATTATATTTGCATTTGATTCTTTTAGATAATTTAAAGGCGT 

CCATTTAGCTTTGCATATCTTTCTTGCACACTCTCCACATTTATTGCATCTATCAAAATC 

TATAAACTTCTCCATCTTTTCAAAGCCAAGTTCAATAAAGGCTTTATCAATATCATTTAA 

AAAATCATCTTTTGGAGCTTTAATTTTTAATTCTTCCCAAATTTCTTTATAGATATCTTT 

GTCTATTTTGTAGCCCTTAATTTCTGTTTTTATGGCATTTCCCAAGGAATAAACTCCACT 

CCCTCCCAAGCCATAGACATAATTTATTTCTACATTCTTTCCTTCTGAAGCATAACTTGG 

CTTTTTTCCCTTTTCTATTACTGCCACTTTATACCTATATCTCAATTCCTTGGCTAAGGT 

GGCTCCAGCCACTCCAGAGCCGATAATGGCAAAATCATACATGGCTAATCCCTATTTTTG 

CATATATTTATTGTATAATTCTAATATTTTCATCTTCCTATTTTCATTATACTTGTTGTT 

ATTACTAAACATTGAGTTATTAATCAATCTTTCAAAAGTTCTTCTATCTCTTGAAGATAA 

TTTTAATAAACTCAGCTCTTTTACAGCTTTTGCACATTCAATTTCAGAATAAAGTCTCCA 

TTCTTGAAGGATATCCATATATTTTTTGAATTTTTGTTCATCAATAGCATCTTTGGTGTA 

TAATCTATCTTGATTAATGAATGGAAATGTATTTATAAAATCAAACTTTTCTTTAAAATC 

AGGTTTTGCTATTGAATTAATAATATCCAATATTTTTTTCATAGTATTTTCTTCTCCAGA 

TAAAAAACTCCACAATTCTTCTGCAATTAAAACCTCTCTTTTATCAAAATACTTGGAAAA 

TTCTATTAAACTACTCATGAATCTATCTTTATCATATCCACAAGGATTTTCTGGATTTTC 

AGTAGGGTCATAGGGAAAGCCAATAAAATAATAAACTTTTTTATTTGGTTTTGTTTCCAT 

CATATAGGCTTTTCCATAAAGAATTTTTTGTTTTTCTCCTCTCATTTCTCCAGCATTAGG 

TCTAACAGTTTTTAACTCAATCATTACAACTTTATCTTTATCTTCAAAATAAACATCTGC 

AGTAAATTCTAACCCATTTACATATTCAGAATTTTTTGAAGTAGCTTCTCTTAATTCTTT 

ATTTTCTTTTTCCACATTTGGCAATCTTTCTCCACTTTTTAAATCATTTATAATCTCCGA 

TATTTTGTCTCTAACACTTCTTTTAATTTTATAGTTTTTAAATGTCCTTTTTTCACCGTT 

AGATAAAATATGAGCAATATTTTCAAAGTAGCTCTGCCCCAATGTTGTGCTTAATCCATG 

AAACCACTGTGATAAAGTTAAAAACTTTAATGCTTCAGTATCATCGTTTATCCCAATCTT 

CCCATAAAAAGCCCTTAAAAAAGCCATATGGT^TGGCATGTTTCTTATTTTTATGTCTTC

ATCTGATATTGTATCAAATCTTGATTTTAATACTCTTATTGTCTCAATGCTAATTTTTTC

TATAACATTTTTACTTAGTGGCATAGCTATTCCTCCATTTTTAATTCAAAGATGCTTTCA 

TAGTATGGGTTTCTATCTCTTTCTGTTCTATTTAAGACCGGTCTTTTAAACTCTCTAACT 

AAAATAAGCCCACTTTTCTCAAAAATCTCTTTATATAGGTTCTTTTTATCATTAACTACA 

ATGAAAATCTTTGCGTCTTCATTTAAAAATCTTTTCATGTTGATTAAAACATCGGATATG

CCTTCAATATACTCTTTTTGTGCTTTTTTTGAACTACCTTTAAATTTAGGTCCTATCTCC 

AACTCATCCAATCTTGGAATGTCT^AAAGCTCATAAGCATAGGCATGCTGCTCATGATAA 

TCAATCTGCCCTAAATAAGGAGGAGATGTAAAAATACCATCAATTTTTTTGTTTTTATAA 

AGTTCATAAAAGTTTGGGTGTTTTTTTAGTTCTTCTTCAATATCAACAGTCCTTGAATCT 

CCATTAATGATTAAATAATATGCATCTTTCCTAATCTTTGAAAATTCTTCTATTCTACTA 

ATTACATCATTTGTATATTCTTCTAAGTGTCTTAAAATTGTTTGAACTGGTCTGCAAATT 

TTTTTATGCTTATAGCAATAGTATGGGTCAAAAACTGGCTCTTTTAGTGTGGCTAAATCA 

AAATGAGTAGTTCCTCTAACAGACCTTGCCGTTCTACTCAAAATTATCATTGCCACTTTT 

TTTATTGTTTCATCTCTGCAGTCTTTAATTAAATTTAAATAAAAGTTTAATTCTGCCCTA 

ATTCTTGGAGAATACCACTTATATAAAAATGGCTTATCTTTAAAAATGTCATCAAACTCA 

TCATCATTTTTGCAGTATTTTTCCTTAAGTTTTTTATACTCTAAATAAAACATTTCCATG 

ATTTTTTCGGAATAGCTATCTTCATCAATTTCTTTTTTTGATAATTTTCTTTTATATTCT 

AAGGTAAAGTATTTTTTGTTGTATTTCTCAATTAATTTATCCATTTCTTTAACAAATTCA 

TCATCTCCTAAATTTTTTGAAAATTCCTTTGTTTTATTTAGCATATCTAATAAAATTTTC 

TTTAATTTTTGAATATCATATTTCTGCAATTTAACTTCAGCAATTAAACAGTTAAATGGT 

GATATATCAATGCCAATAGAATTAATGCCCATCTCCATACATTGCACTAATGTTGTTCCA 

GAACCCATAAACGGGTCTATTATAATATCTCCAACGTTAAAATGCCTCTTTAAAAAATAC 

TCTACCAATTGTGGAATAAACTTTCCTTTGTATGGGTGAATTCCATGAACATGTTTAGTT 

CTCTCCTTCTCAGATAACAAATCAAATGCTAAATCCCAATCCAATTTAAATCCCAATTTT 

TCTTCCCATTCTTTTCTTTTTTCAAAAAATAATTTTTTATAATAGTTTTCAACCTCATCA 

ATATCTACATAAACCCTATTTTTGATTTTATACTTATTGACTCTTCCATACTGCACTAAA 

TATGAAATATTATGCTCTTTAATTTCCTTACCAAACTTTTTTGTTMTATTCTTGATGCC 

TCTTTTATTGTGTAAAGTTTTTTTGCTGGCTGTATATCTAACCATGCATCCAGATTCATA 

ATTCCTCCCCTATATCCTCATCCTCATCAAAATAAAGCCTATAAGTGTCTATATCATAAC 

CCATCTTGTTTATAAACATCCTAATTACTCTTCCTGGGTCTTTTGATTTTGTAATTGCTC 

TACCAACGATTAATATTTGATATTCTTTTAAAAGCTCTTCAACATTCTCCACACCAACTC 

CTCCAGCAATTGCTAATAAGCAGTTTTCCTTAAATTTCCATTCCTTTTTAATTCCAAATG 

TCTCCTCATCAATCCCTCTATGCAAGATAACAACATCTGGCTTTAATTTTAATGAATCAT 

ATAATTTTTGAGGTTCAGAGACGTTCATCATATCCAAATAGCTGATTAAACCACATTTTT 

GACATTCGTGGATAGCTTTAATTATTGTTGATTTTGGTGCTACTCCACTTATTGCCACTG 

CATTAGCTGTTGCTTCAAATGCCAATCTTACCTCAACCCTTCCAGTGTCTAAGGTTTTTA 

AATCAGCAACAATAAAGCCATCAAAATATTCTCTCATTATTTCAATAACCTCTAAACCAA 

ACTTTTTAATTAGTGGTGTTCCAGCCTCTAAGATGATGTGGTCGCTATTTGGAATTGTTT 

GT/^CAAAAATTCCAAATTCTCCATAGTTGGGACATCCAAAGCAATTTGTAGATATGGAG 

GATACTCCAATCTAACATCCCTAAATCCAACTAATGGATGCAAAGCTCTATATTTCTCTT 

TCTTTACCTTCTCTTTTGAAGGATATTCATTTAAAGCTCTGTTTATAGCTAACTTTGCTG 

AGGCATAGAAGTATTGGAAGAGTTTTCTTTTATTTAAATTGGTTATTGGAACCTCTGGGA 

CATTAACAGAGACAACAACCTTTAAATCTTCATCTAAATCTAAATCAGCAACTGCCTTGG 

CAACTGCATACTGAATAACTCCCTGAAATAGCTCATCCTGTATCTCACTCTCTATATTAT 

GCCTTGGAACAACTAAGGTTAATGGTTTAACTATTAAATTAGGTCTTAAATTGGCAAAAA 

CACAATTTCCTCTTGTTAAAGCATTTGTAAAGGTATTCTCAATTAACTCTCCTTTCCCTA 

ATGCAACATTAACTATTGCCTTAATTTCATTTCCCAAAACTGCTTCTCCAAATTTTATCA 

TATTAATCCCTTGTAGCTATTTTATTTAAAATTTAACAATTTTCCACTCGCATCTCTATA 

TACTCCCCGAACAACCTTTTTAGAAAAGGTTGATCAAAACTAAATATCAATACCTTATAA 

TGTTAATAATAAATCTTCTTACCGCTTGCATCTCTCTTATACTTCCCAAATTCTCTAATA 

AATTTCAACTTATCTCCCCAAAATACTGGCCCATCCTTACAAACACAAAGTCCCTCATCA 

TCTACACAACACTGCCCACAAATACCTATACCACACTTCATATACCTCTCCATTGAAACC 

TGAACTGGAATATTATATTCATTTGCTATTTCTACAACCTTTTTCATCATTATTTCTGGC 

CCACAAGTTATAATTAAATCAAATTTCTCTTCTTTAAGGACTTCTTTCATTTTTTCAGTT 

GTAAAACCTTTAAATCCAAAACTACCATCATCTGTGCAAATCTCTAATCTGCTAACTTTT 

TCAAATCTATCCAAAAATAATAACTCTTCTTTAGTTCTCGCCCCTAATATGGTTGTTATT 

TCAATTCCCTGCTTTGAAAATTCTTCAACTGCTGTTATAATTGGTGCAGCTCCAATACCT 

CCAGCAACTGCCAAAACCTTATCTCCTATTGGCTCAAAATATGTTCCATAAGGCCCTCTA 

ACTCCTATTATATCTCCTTCTTTTAGTTCATGCATTTTTTTGGTAAATTCTCCAACTCTT 

GCAACACTAAAACTATTTTTAGAAGAAAATCCAAATGGTTTTTCATCAACTCCCGGAAGC 

CAAAGCATTGCAAACTGTCCCGGCTTAAAATCAAAATCTTTATCTACTACAAATGTTTTT 

ACTGTTGGGCTTTCTTCTATTATTTCTTTTATTCTACATATAACTGGTTTTTCCATAATA 

TCACCTGAATTATAAAATTTTCTATTAAAAACTAAAAATAAAATAATAAAAACTAAAAAC 

TTAAATTTATTAAATACTTTTACAAATCATTTATTGTTCTAACGACTTTTCCTTTTTCTA



TCAATTTTACGATTAAATCTAAGCTAACTGGTTTATAATTAATAACTTCTACAGAAACAT 

TAATACTCTTCCTTTTGGGATTAATAAATGGATATTCATCTAAATGGTTTGCGTGATGAT 

GCCCATGAATTATCCAACCATCGAAGTTTAAAGTATAAGAGCTGTCTGGATTATGAATTA 

GCATGAATTTATAGCCGTTATATTCAATAACTCTAAACTTCTCACCAAACTTGTCATGAT 

TTCCTCTTATAAAAACAATCTCCCCATTTAACAACTCTAAAAGTTCTCTTGCTTTCTTTG 

CCTTATTTTTGCTTAAAATCAAGTCCCCTAAAAAATAAACAATATCCTTATCCCTAACCA 

CATTATTCCAATTTTTTATTAGAGTTTTATTCATCTCCTCAACATTTGAAAAAGGTCTAT 

TGCAGTATTTTATAATATTTGCATGGTTAAAATGCGTATCAGAGATGAGGTAAATTTTTC 

TCATAGACATCCCACAAAATTATATAAATTATTTAAACCATGCATCTAATGTTTTTTGCT 

TAGTTTTGTTTGCAATTAAGTTATAGAGTTTATCAACATGCTTTTTAACCCTATCATAAT 

TAAAGTCATTTTCATCAACTAAGAATTTTATAATTCCCTCTTTATCTGGCAATTTTAGGC 

TTAATGAATAGTTATCGGTAACCTTTGGCTCTTTAAATATCCTCTTAATCTCATCGTAGT 

ATTCAACCTCTTTTTTCAAAACATCCTTAGCTACACCACTTCTAACCAATTCATAAGCCC 

TTTTAAATCCTATTCCTTTAACTCCTCCTGGATTATAGTCAGTTCCCATAAATATGGCTA 

TATCTATCAAATCATCCAAAGAAATTCTTAAATCCTCTAAAACCTCATTTAATTCAATAA 

GTTCTGGCATCTCCTTTGTAGTTGTTAAATTTCTAACAACTCTCGGAGCTCCATATAACA 

AGGCATCATAATCTTGACTTACAACTGCCCAAACATCTCCCTTCTTTGCCATATAGCTTG 

CTTGTGCCTCTCCCTCAGAGGGAGCTTCAACATACGGAATGCCCATCAAACTTAACAAAT 

ATTTGCAGTTTTCAACCATTTTCGGAGTTAGATAGCTAACCCTCTTTGCATACTTAGCAG 

CTTCTTCAAAATCCTCCTTTTTAATTGCCTCTTTCATCTTAAGTTCAGCTTTCTCTTTCA 

TCTCTCTCCTAACTTTCCTTGTTTTCTCCTTTAACTTTGGTGGCTCACCATCAAAAACCC 

AGATTGGAGTTATATCATTCTCTAACAAATGTATGGTTTTATAAAAAACTCCGTTATATG 

CTGAGGTTATCTCTCCTTTTCTATTTCTCAATGGAGAACCATCTCTCAAACGTATAGATG 

TTAAAAACTGATATAATGCATTCATTCCATCAATAGCTACTTTTTTCCCTTTTAAATCTT 

CAAAGGAGATAATATTTTTTGGAATAAAATCACCAAACTGCACTCCCATGTTATCCCCTA 

CATTTAATCTTAACTAAAAATTATAGTGTTTTTCAAAATTAATAAAATTTATTGATAAAG 

ATTTGAACGCCTTCCAAAGAAGGAGTTCATTAATACCTTAGTTATTTAAGAAGTTTGAAA 

AACACTATATAACTGCATAAAAGATATTTATAAAAAACGGTTTAATTTTTTAAATTTCTA 

TAGAAATCCATAAAAATAGACAAAAGTTAAAAATTATTGTGAATACTGCTCTGCTATATC 

TCCAATTACTGGAAGCTTAAACTTCTCTCCTTTGTATGCCTTATACATACACACAATCCA 

CAAAATAAAAGCTGCCAAATTTACCAGACCACTTAGCATCCATCCATAGGGTATAAATGC 

CAATATTATTGATAAAACCCAAAGTCCTCCGAATAGTATTATGGATTGAACTGCATGAAA 

TTTAACAAATTTACTTTCCTTTTCTAATATATAGAACAATATTCCAGTTATTACTCCAAA 

TAGATAACATAACGCTCCTTCAATATTTTCATCTAAACCGAGTGAAGTTTTTCCCATAAA 

TATCACCTATATATACGTAAATTTTTATAAAAAGGATGAATTTTATTGTGAAGAGTATAT 

CTTACCTTTGTAGTATCCAACAACGATTTCATTTGTATCTGGATATAAAATTATTTGTAT 

TGCAGATTTATTGTCTTTTGAAACATACCATAACACCATTCCTTCTCCAGATTGCCCTCC 

TCCTGCTATTCCTCCACTACTTTCAAAATATCCAGATTTTTTTATTGCCTCATTCAGCTT 

TTCAAAATCATTAGTGGTTATTTTTCTCTTTGGAACATAAGTCAATACAATAGACTCTCC 

TTCATTTTGCTTTCCTGTTGAAACATATTCCATTAATTTAACTTCTCCAAACACTTCATT 

TAATATTGGTCTAATTTTCTCATCAGCTTCTTTTGCAGTTCCTATTGGCTGGACATCCCT 

TATTGAATTGTAATCAACCCCTTCATCTTCATTTTGATATTCTTCCTGGTTTTCATTTTG 

TTGTTGTTGCACTACCTGTTCTTGCATATTTTGAATCTCTTCAACATTCTTTCCTCCAAT 

GCATCCGCTAATGGTTATGCCACATCCTAAAACACTCAAAAATATTAAAAATATTAAAAA 

TTTCCTCATAGTCCCACCGTAGAACTTTATAAAAATTCTTATGCTTGTCATGCTTATATA 

TGTGAAAATATTATCCAGCTAAAATATTATAAATAAGTAATTTAATTTTTTAAAGTTATA 

TAAAAGGTAAAAATTTTACAAAAATAAAAATAGTCCAATTTATCTCCCATTACTCATAAG 

CTTTTCCTTCCAAATCATGTCAATATCTACACTACCTCCTTGGAATTCACCAATATCT^^ 

TATACTACTATAGGTTTCTTCAATATCTCCCTCAATCTCTAAATAAGCCCTTTTTAGCT 

CTCTATATTCCCTTCAGTTGTTTTCCACAATCCTCTTTGATAAGCCTCCAACAATCTCCT 

TGCAATCTCTTCTAAGGCATAGATGTTGTGTTCCTTAAAGAACTTTCTATTCTCTTCATT 

TTTCACGAACGTATTAAATATCTCATCAAATATCCAATTCTCAACCTCTTTTGTTGTAGC 

ACTCCAGCCATAAACTCTGCCAATTCTCTTGGCTATATCTCCAGCTCCTTTGTAGCCATG 

CCTCTTCATTCCCTCAATCCACTTTGGATTTAAGAGTTTTGTTAAGCTAACTCTCTCM 

ttcttcttttaaagttcttacttcaacattgtttggatttcttgtatctcStaaS^^ 
cttaacctcttctccttttaaaacccttgcggcatttgttaaacctccatSg^^ 

nl^SS^^^^^^^^^^^^^'^^^^^^^^^TCTGTAACAACTTTATTAAATGTTAAATC 
AACTGTCTTTAATATATTTTCAAATGCATTAATCGCCTTCTTTCCATAGACATCCTTTCC 
ATAGGCATAGGAGTTCCAGTAGATAAATGCATCTTTTAAATCTTCATCATTTTCCCATGC 

TACAAAGTTCATCTCCAATGGCTCATCTAAGTTAGCAACTTTCATTATTGCCTCATCAAr 

aagctctatgcagtttgggaacatatcccttgttattccactaactctaatS™^ 



AATCCTTGGTCTTCCCAACTCCTCCAATGGAATAACTTCTAAGCCAACAACTCTCCCTCC
TCTATAAACTGGCTTAACACCCAATAGATATAAAATCATCCCCATTCCTTCCCCATCAGC 
CCACATTATATCAGATGCCATCCAATATAGAGCTATGTTTTCAGGATACCTTCCCTCCTC 
CTCTAAATATCTATTAATTAATTTTTCAGCTAATAAAACCCCTACTCTATAAGCAGATTT 
CGTAGGAATTCGGTATGGGTCTAATGAGTAAAAGTTCCTTCCTGTTGGTAAGATATCATA 
GTTTCCTCTTGTTATCAGCCCAGAAGGCCCTGGCTCTATATATTTGGCATCAATGCCTCT 
CAACAAAGAGCCAATCTCATCTGATTTTTCAATTCTCTCATTGATATCCTTAATCTTCTC 
CTCTAATTTTTTATCTTCTATACTCTTTCCATTTAATACATCTGAAACTTTCTTCTTTAG 
GTTTTTATCTTTATACTCAAACTCCATAGGGGGAAGCCCCCTATTGGGATACCCCGGATG 
CATTGCCTCGCTTCGCTCGGCAATGCCTCTCCTTTTACTATTCATAGTATTATTCTGGAT 
GAATATCGCCTCTAAAATACTCTTTATAAACTCAACTCTCTTCTCTCCACTTGGAAGTTC 
TCCjWVGATATGCATTCCATCATTGCACTTCGAGTTCTTTATCATCTCTAAGATATCTCT 
TAGCTCATCAAATATCTCTTTAAAGTTCTCATGGATTTTCCCTTCTTTTTCAATCTTCTC 

aattttttctttaattttcaataaattggtttttttaacttcctcaactatcaaatgctc 
taactgatgccttcttgaagcatccatctcctttaaatactcctctatatagctatctaa 
tgtctccaactcttcataaaatgcatcaaccataactgtttgcatgtgatcaataatagt 
tgcatagcttcttctctttgctatagttccctctggtggattatctgaattataaatata 
gagatgaggaatatctccaatacagatgtctggatagcattcgttagataaaccaacgtt 

TTTTCCAGGTAAAAATTCCAAAGTTCCATGAGTACCAACGTGGATTATTATGTCAGCAAT 

gtcattaaaatatttatatgatgctatatattgatgagttggtgggcaataagggtcgtg 
taatatcttacaaactcttccatcacatcttgccccagcacatcctctttttggttgaac 
acaaacatagacattcccaaactttaaaccagttataactatcttattttttccattaac 
tttataaatcattcctgctgggatgtctttaccatttaaatctccccatgtttctaaaat 
tttattttttacattctctggcagtgtgttgaagtattcataatactcttcttcatccat 
taagtatagatatcctcctttagctataatctcatttacggtagtccatctaaactctga 
aattgccttcttctgcataattagctgagctaactcctctccattttctggaatattttc 

TACATAGTAGCCCTCTTCCTTCAACTTCTTCATTATGTTTATAACACTTTGAAAGCTGTC

taaatgggcagcacttcccacagttgcctcaacagatgcacatgcattgttatgcaatat 
aaatataacctttctatctttcttaggtttgtattttagctcaatccatctctttattct 
tctaacaactttgtctatcctttcctcaataccaaacttcttctctaagccgttctcatt 
ttcagtagttccaatgataatcggttctataaccccttcaaactctggcaaggctatagt 
ccaaccaatatctgcagataaaccttgctcatcttttttccaatcctcatagcttttata 
ataactcattattggatgaaatactggcacatctaactttttaagtatctctactccaga 
gattttgtttaaattagccttatcttttacagttcccaatggaaatgacagtagattgat 
taaggcgtctattattggcttatcatctttaaggaagtattttaaaacactctctccact 
acctaaggcatttaaatcctcacacttagctccataggaaaatactggaattacattgaa 
ttctttgtccaatctatttaatagcttctcaataacatccatatcatcattaactaaata 
atgccttgagaataaaatccccaccgtatattttttattaaactcaacgtcttttaaaaa 

TTCTTCTAATTCTTCATAAATTTTGCCTCTATAATAGATACCTTGGAATGGATGCTTTAC 

aacatctttatctttacccattagatataaaaccatatttttgaagttatctaaacctcc 
ataagttataaataaataacatttagcagatttttcagaattccaaaagtttgggtcttg 
ggcaacaactataacgttttcattgaacttctttatcttctctaaatcaatatcatctga 
tgatgttctataaataaaaacttaatcataatcttttgcatcctctaaaaactcatcatc
aattggatttctgttagaatatattttatattcaacatctactccttcttttttaagctc 
atccaacgccttttttaatattgagcaataagatgcccacatataaaatgtgattttcat 
aacaccaccgtaatattaataacttataataactactttagtgctaattttttgtagatt 

TTATTACTACATTATTACAATTTTAGTATTTATAATTTGTCATTAAAAATCATGATAAAT 

ttcataaaaaataaaaaattaaaattagtaaatagaagctccatcattgtttggtttagt 
taaaataacctttccatacctatttaatttttcttttaccttttcaacattttcatcttc 
aaccatggctatataacttggacctgttccagataaaccggctgttattgccccagcatc 
taatgcgtctattgctatgtttgttggaaagtttaaagctgatgcataaagaattccatt 
taaaaataaagctttgaaatagtttccatttatagcctcattaaaggcaatttcaacata 
atcctttattagcttcattctatttacatcaacattcttttctaaatttggaattaatat 
taagacgtttaaatcatctctcatcttatctctttttaaaatttttctttctatattgtc 
agttattgttattcccccatagtatgatgcagtagcatcatcataagctccagtaacagt 
taatttttcatcaaaacttgattttatccctaaatttaatattagctcatcatctatttt 
ttcccctaatgcatcaaatgttgccaaaacaactgcgttagaagtggctgaactactact 
ct^tccagattttataggaatttctgtctttgtttcaacataggcagagtaattcagccc 
aaaataatctaaagtatttttgacacatcttactattaaatttggcttaatgtttggatt 
atctaaaactttaccctctattttgttttttccatcatctataagtttaactttggcata 
aacctttaaatctaatccaaaagctgaacccttacctgttgctatagcgtttattattgt 
cccagatgctaatgcataggcttttccttccataaaaaatcactccattacttttgtagc 
tataaataaagtggagctgaacgaagtgaagccccactcattttgatgaacctttattaa 
aggttcatgataatgcataagttctcccttccataaaactccccttagttattgctccaa

TTCCATAAACCCACTCTATTTTCTTTAGCTTCCCTCTCAACATTTAAGAATTCATCCTTT

AACTCAAAATTACTTATATAAACCCTTGCATATCCATACTTTAAAAGCTCTTCATTGAAG 

TTTATTAAATTATTACTATTATTTATAAAGATGTATGCTAAATATCTCCCATATTTATCT 

TTCTTTGGGGCTTCATTATCAAAGACAATTATAACTGTTTTATTTTTAAGTTCTTTTTCT 

GCAAAATGCTTAGCTTTATAGCCCCATTCTTTTAAGTATTTTGTATCTGTTATCGGTGTT 

CCATTTAATAAATAATATTCATACGGGTTGTTTCTCTTGTGAATTTCTGGAGTATCTACC 

CCTAAAAGCCTAATCTTCCATAATTCCCCATTAACTTCAACATAAACAGTGTCTCCATCT 

ACAACCTTAACAACCTTTCCGTAGTAGTGTTCATGAGTATCTACAAAAGAAGTGTAATTA 

TTATAACTCCAACTATCATGATAATAACCGTTAGAATTAGAGGATGAGAAATCAACACAG 

CCACATAGAGTTGTGAAGATTAACATAGATAGTATTAGGAATTTTCTCATAATCCTCCCT 

CTAATCATTTTAACCTAATAAATATATACTAAATACTTTAATACTTGCTATAATTGATAA 

TAAAACAACAACTCTTGTTATTTCATTTGAAGCTCCTAAAACATCTCCATTAACTCCTCC 

AAAATGTCTTTTGGCTATTTTAGCCATACATAAGCCAGTAATTATTGTCGTTATTATGGC 

AATAATAACTATCTTCCTTTCAATCCCACTGAATATTAAAAGTAATGGGAGAGATAAAAT 

AATACCAATTGTTAAAAATTTTTCATCTGCCTTTTTAACAAAGTATCTCCCAGTTCCTTC 

AATTAAAGGATTTCCAAAGGTTGAACAGCTTAGCATTCCAAGCTTTGCACAAACCTCTCC 

AACCAATAGATATAGGATATTAATGTCTAAAATATAAGATAATGATATGACTGCCATTAA 

ATTAAAAAATATTGCAAAAACTACTCCTCCACAGCCAATATATCTATCTTTCATAGCCAT 

TAATTTCTTTCTCTTATCTCCAACAGCCATCCACCCATCTCCAAAGTCAATTAAACCATC 

TATATGGTGGAATCCGTTTAAATATTCAATAAAAAACAAAATTAAAACAGCAGATAAAAA 

ATTGGGGAGCAAAAAACTAAAAATATAACCTAATATCAAACTAAAAATTCCAAACACATA 

TCCAATTAAAATAATCAGATAAAAATAGTTGGCAATGTTTTCAAAATCAAAATCTTCTAC 

ATAGATTGGAATCCTTGTAAAAAATGACAACAGTGCTTTAAATTCCTTAAACATTGTTAT 

CCCCAAAAAATTTTATATTTTAATCCTTTTTTAAATTTTTAACAATATCTACAAGCTCTT 

GAAGGGAGTTTATTGTGTAATCGCTATATTCATCATCTTCCATGTCTTTATATTTGCCCT 

TCAATATCCTAACTGTTATCATCCCCAACTCTTTAGCTGGTTTTATATCCTTATCAACCC 

TATCTCCAACATATACTGTTTCTTCTGCTTTTAAACCCATTCTCTTTAATCCATATTTAA 

AAAACTCTAAGTGAGGCTTTCCTAAACCAAATTCCTCTGAGGTTATAACATCATCAAAGA 

ATGGATGAATTCCTAATCTAATAAGCTTTTCCCATTGCTTTATAGTTAATCCATCAGTTA 

TAACCCCCAACTTTAATCCCATTGCCTTAAGTTCCATTAATGTCTTTATTGTGTGTGGAT 

AAGGCCTTAATAATGCTACTTTAACGTTATGGTAGGTTATTATTCCAGTAGTTATTATTT 

TTGGGTCATATTTTCCTAAAACAGCTTTAACTAAATCATCAAAATGCTTTCCATAATTTG 

AACCTTTGTCCTTAATGATTTTGTTTAATATGTTCATTGCTTCTTCAAAATCTATATTTA 

AACCAGCATCTATCATTGATTTAACTGCTTCTCTTCTTGCAATCTCTACAAATTCTGATG 

AATTATATAAGGTATCGTCTAAATCAAACAAAATTCCCTTTATCATATTTTATTCCCTTT 

TAGCATTTTTTACCTTCTCTTTGACCATTTGAAACAATCCAACAAAGTCATCATCCTCAT 

TAACAGGAACAACAATTAATCCTAACTTTTTATGCCTCTCTATCCAACCTTCATTATAAG 

CAGGAACTTCTATCTTTATCGGTTCATCTGTTGGATTCCCAACTATAACATTTCTGTATG 

GGAAGGGCTTTACTTCATCATCCTCTTCATAAGGAAAGGTAGTTTCTCTAATATATCCTC 

CTAAAGCCATTGTCATTCTTGGTAATAAAACCATCTTTGGCATATAATATCCCCCTTAAT 

CTACTTGCAAAAAATAATATAATAACCTCCCAAAATTTAAACTTAACTAAAATAGTTTAT 

ATTTATTTTTATAAATTTATACATAATAAATAAAAGAGAAAAAAAGAATGGGGAAGTTAG

TATTTAGTATTCTATGTAGTCAATAGCATGTTTAAATACTAATAAGTTCCTGTCTCCAAC 

TTTTACCATTATTTCATAATTAGAGACTCCTGTAACTTCAGCATCTAAAACTTCCCCATT 

TCTTAAGAATATCTTGACCTTCTTCCCATTTAATCTTCTTGCATATTCAAAGTTTGGGAT 

GACTTTCTTTGGTTGCTGTTTTTTTACTGGCTTATTCATCTACTCCCACCTTTTGACATC 

TTATAAATATCAACCCTTATAATTTTCAATCACATATATATACTTTTTTAAAGATAGCAA 

AAAATTACTTTGAGAGGCAGAATTCTTTAATTGGACAGTTATCACATAATGCCTTTTTCC 

TACAGAACTTTTTACAGTGCTCTACTATTAATGCGTGATATTCTTTGTATATTTCTAAAT 

CTTTTGGTAAATTTTTTTCAAATATTTCCTTAATCTCATCATATTTAGCTTTTTCGTTAA 

TTACTCCCAACCTACTAAACATTCTTTTGGTATAGGCATCAACAACAAAGCTCTCCCTAT 

CTAATGCATACAACAAAATACTATCAGCTGTTTCCTTTCCCACTCCATTTATTGATAAGA 

GCTCAGCCCTTAATATTAAAGTGTCTTTATCTGTCTTAGCCATCTCTTCTGTATTTCCAT 

AATTTTCAACAATAAATTTAGTTACATTTTTTAGACGCTTAGCTTTTAAATTATAAAATC 

CAGCTGGCCTTATAAGTTCTTTTAGTTTATCTTCATCAACATTTAGTATTTTTACTTCTT 

CCAACAAATCTTCCATCTTTAGATTATTTATAGCCCTCTCTACATTTTTCCAACTTGTAT 

TTTGAGTTAAAATTGCTCCAACGACAACCTCATACCTTGTTTCGGCAGGCCACCAATTTT 

GATGTCCATAATAATCTAATAAAATTTTGTATATTTTGTATATCATCTCAAATTTGTTCT

CTTTCATTTATCATCCTCCTCTATAATAATGGACTTATTATCAAATGGATTTGCCCTTAA 

ACATAATGAATACAGATATGGATAGTTATCTTTTAAGTGCATTAGGTAGTTTAGCCATTG 

AATAATCAATAATTTATAAACTCTAATAATATCGTTTTTTAAATGGTCTAAATCACTTTT 

TGGAAGGTTGCTTAAATCCTCTCTTCTATGTAGCTCATCAGCTAAATGAAATACTGCCAA 

TAGTAATTCGGTAAAGCTTTCATGCTCCAATAGCAAAGGATTTTCCATCAATCTTAAAAG 

AAATTCTTTATTTCTCTCTAATAGATTTTTAAGCTTATATAAATCAATTTTTTCTATATC

TATGTTACAATCATAATTCATTAATAATTTTTTTGTTTCTTCGTAAGTTTTATCATTCCA

TTCATCTGATATTTTTAAGTAATCCCTTATATTCCCAACATCTCCTTCTAAGATTATTTT 

TAAAAGTTCCTCTCCAACACTATTAAAAAAAGAACCAACGACCATATTTAATTTTTCCAA 

TATCTTCTTTTTTTCCCTATAATCTAAAATTTTCTCAATGATTAAACTTACAAGCAAAAC 

TTCAATAGGAACAAATGCCAAATGTAATAAAAAATAGCTTAATATGTAATCAACTTTTCC

AAAGATTAAAAAATGTATTGAATAAACCAATATAGATTAAAAAATTAAACAAATAGCTAT

TATTAACATATACCTTTTATCATTCATTTTTATCCCTTAAAATTCCAGATTTTGTCTCCT 

ACATAGTGAAGGTTTTTAATAGCTTTTCCTTTATCTGCTTCTTTCATCTCTTTTCCATTC 

ATTAATGCAATTCCAACACATATTGGCTTTTTGTGGTTTTCATCTACCACAAAAACAACA 

TCCTCCTCTTTAATATTTTCATCTGCATCTACAATTCCTGGAGCCATTACATCTGCTCCA 

TTTATTAAAAATTTTATAGCACCTATATCAACAACAACTAAATTTTTATCTGGGAGGGAT 

TTTAATAACAATTTTAATGTTGGAATTACTTTATCATCTTTTTTAAATGCAATTGGCTCT 

TTATCGACTAATATTATCTCAAAGTCATCAGTTATAGCTATCTCCACATTTCCTTTTTTT 

GGGATTATCTCATCAACATTTTCAAAAAACACTTCCAATTCTTTTTTTATTTTTTTAACA 

TCTTTTTTACTTAAAAAATATCTTTTCCTTATTTCCAACCTCTCACCTTTTATAATAAAT 

TACTCTAAAATTTTATACAAAGTAGTTCTCTCCTTAGGAATTAATCCAACCCTTTTAATC 

ATGTCTCTAATCTCTTCAACACTCATATAAACTCCATGCTCAGCTCCTGCACTTCTTGAT 

ATACTCTCCTCTATCAAAGTGCCACCAACATCGTTAGCCCCACATCTTAAAGCAACTTGA 

ACCATCTTTTTTCCTAATTTAACCCATGAAGCTTGGATATTTTTTATCAAACCCTTAAAT 

ATTATTCTGCTAACAGCAAAAACCTTTAAATCTTCAATTCCAGTAGCTCCAGCTTTTGCC 

TTTCCTTCTTTATAGATTGGAGCATATTTATGCATAAATGAGAGTGGAACAAATTCAGTA 

AAGCCGTTAGTCTCTTCCTGAATCTCTTTAATTATAAAAAGATGATTTACCCAGTGTTTA 

TATTCTTCGATATGCCCATACATCATTGTTGCAGTTGTTGGAATGCCTAATTTATGAGCC 

TCCTTAATTATATAAATCCACTCTTTAGTTTTTATTTTATTTGGGCAGAGTTCAGCTCTA 

ATGTCATCATCTAAAATCTCCGCCGCAGTTCCTGGCATGGAGTTGAGACCATTTTCTTTC 

AATATTTTCAATGCTTCTTTAATATCTAAGCCAGCATTCTCAGCACCAAAATAAACCTCC 

ATTGGAGAAAAGGCATGTATGTGGATATCTCCGTAAGGTTTTGTTGCTTCATGCACAGCC 

TTTAAAATCTCCGCCTGATAATATGTATCTATCTTTGGATGCAATCCTCCCTGAATACAA 

ACCTCAGTGCAACCAAATTTTTTTGCTTCTACTGCCCTCTTAGCAATCTCATCTATATCT 

AAAAAATAAGCATGTTTGTCATTTTCATTGGCTCTGAAAGCACAAAATCTGCAATTTCCA 

ACGCATATATTTGTGAAGTTTATATTTCTATTTACCACGTAGGTAACTATATCTCCT^CT 

TCCTCTCTTCTCAAAGAATCTGCAAATTTAAACAACTCAAATATAATCTCATTATCTTCA 

AATAACTCTAATGCTTCTTTTTTGGATATTTCTTTCTCTCTAAATTTATTTGGGTCCATA 

TTCTCATCTCTTAGAGTAGTATTCTCCAGTTATTTTTAGTTCTCCATCTTCTATTTTATA 

ATGGAAACATGAATAATAACCCTCATGACATGCAACCCCTTTCTGTTCAACTATAAATAA 

TAAAGCGTCTCCATCACAGTCCCTAT/^AAATTTTATTAATTTTTGAACATTTCCACTCTC 

TTCTCCTTTTCTCCATAACTTTTTTCTACTTGTTGAATAATAATGCATATATCCAGTTTC 

TAATGTCTTTTTTAATGCCTCTTCATTCATAAATGCAACCATTAACACATTTTTATTCTC 

ATCACAGGTTATTGCTAAAATTAATCTCTCTCCTTCTATATTTCTGAATTTTAAATTTAG 

TTTTTTAACAGTATCTTCCACATCCATGAAATCACCTAAATATGGATATTAATTAGGACT 

GAAAGTCCTAACTTAATAGACGGGTGGTATACCAATAGGAGGTTTCCTCCTATGGTTACC 

AATCATCTAAACCCATCCATTCTCCGACTATGTTTATATCCTTTCCTCCATATACTCCAT 

CAAATTTGTCAATTTTGACTATTGATGGGAGGGATATTGCTGCTCCTACAATTACTGCTT 

CTCCAATACCCAAAGATGCCAAATCTTTTACCAAATCTTCTCCAAGTTCTTCTGAAGCTC 

TCTGTATATATTTTTGGTCTTCTGGTTCGACTATTTTTAAAATTATCTTAGTGTTCGTTT 

GAGATAAAACATCAGGATGCAATTGCTTAGGTCTTTGGGATACTAAACCTAAACCAACAC

CAAATTTTCTTCCCTCTCTTGCTATCTTCCCCAACCATAAGCTTGCTGAGTTTTGTTCAT 

TTACTGGAATAAATATATGAGCTTCTTCTACAATTAACAGGACAGGTTTTGTTACAACTT 

TGTAGTGTGATTCAATAATGTTTAAGTTTGATTGTGCAACTCTTCTAATTTCCTCATTTVA 

TACTATATACATCCTTTAATGATTTTAAGTAAGTTATCCTTTTTAAAAGAAGATGTTTAG 

CTATAAATCCCACAAAAGTAACCATCTGAGGAATCTCCAACCCACTTAAATTAACGATGT 

TTATTTTTCCAATTTCAAATTCTTCAATTACATCTCTATCCCCAATATTTAATGCATAAT 

CTAATTTGAATTTGCTAATAGTATCAATGAGAGACATCAATATAACGAAATCTTCCTTTT 

CTAATTTTCTTCTATCATAGTTCCTTCTTAATGGGTTGTAATATTTAATTTCCCATCCAA 

CTGATGCTATCTTACTCCATTCATAGAGTAGATTTTCTATTTTTTCAATAAACTCAATTC 

CTTTAGCATCTGGACATTCATGTTTTACAGTGTGGTATGCAAATTCCACATAAACTCTCT 

TCTCTATCTCATTATCGCCTATCCCAATTAAATTAGCAAATTCACTTGGAGCTAATAAAA 

CAGGGTTTATTATTGGATTTATTACCTTTATTTTCCCCTCCATGTCTTCATGATATAAAG 

AGATATACTCTCCATGGGGGTCTATCATTATTACAGTTCCATTTTTCTTTGCAAGTTCTC 

TGCACAAAACAGATGCGGTATTTGATTTTCCCCCTCCAGTTATAGAGAGTATTGCAAAAT 

GTCTTGATACAAGTTTATTTGTGTCTAAATAAACTCTAACATTATCTCTTGTTAATAAAT 

GACCTATATTCAACCCATCTGGAGTTAGATATATATTATTTAGGATTTCATCATCACACA 

ATCTAACTTCACTGTTTGGGAGTATTGGTGTTCTATTGGGAATTATTTTGTTTCCATCCA 

ATACACCAATGACTTTAACTTCACCAACAAATTTCTCAACATCTGCAACTACATTTTTTA 
TAACACCCAATACATCTCTGCCATCAACATTTTTTGCAATTACATACTCTCCAAATCTTA

TCTTTTCAAGGGATTCAAAAGTAAAGTGTGTTGTTGTAGTCTTTCCTACAACCTTCATAA 

CATACACCCAAGTTTTTGGAGAAGCTCTTTTATTTGCATTAAAACGCTCTCATCAGATTT 

ATAATCATCAATTAATTCACAGTATCTGTACATCTCTAATGATAACTTGGACTTTATATC 

ATTGACCATTTGGGATATTTCATCTAAGCTTATGTATGATTCAATAAATGATAGATAGAT 

ATATGCTCCAATAGGATTATTATTTGCCATATACTTCCATAGGTTTGATGCATAGAAACA 

AGTAATATTTGGTTTGTGATACATTACAGGAATATCAAATAAAACCGAAGGTATTTCAAG 

ATTTGTAGGTTTTCTACTTCTTAATTTTTCTAATATATTGTCGATTTCACATGTAAATTG 

CAAAGAATTGACAATTTTCAAAATTTTTGTGTATGTTGTTATCGCTTCATCGATTTTTCC 

AGATTTTTCAAGAACTATTGCAAGAGATTTTAATGTGCTAATATCTTTTGGATTCACTCT 

TAAAGCATTCATAAAAGATTCATACGCATCCTTGTAGTTCCCTAACTTGTAATAAATATA 

CCCTTTGACATACCACCATCTACTGCTGTTTTCCAGATTTATTAAATTATCTACATATTT 

CATAGCCTCATGATAGTATT^TTGGGATTCCTGTTGTCTCAGCTTTTCCAATTAGGGAGTA 

TAAGTTGAGTGCGCTAATATCTTCCTCTGAGAGTTCTTCAAGGTATGTTTCTAAGACTTT 

ATCAAAAAAGTTTGAGGATTTTTCCAACTCTCCTTTTCTGTAATAAAGTATGGCCAATCC 

AAAATATGCATATGGATTTTGTGAATTTAGCTTAATTGCCTTGTTATATGCTTCTATAGC 

TCCCTCTTCATCTCCAAGCTTGTATAATATATCTCCTTTTTTTGCATAATATAAACTCCT 

ATTAAATATTGATATCGATTTATCAATATATTCAAGGGCTTTTTTATATTCTTCAAAGAT 

TTCTGCAATTACACTTTTATAATAGTAAGAGATGTCAGAATTTTCATCTATCTGAAGCAC 

TTTATTAAAAATTTTTAATGCACCAATGTAATCTTTATTTTTTATCATCGTAAGTCCATT 

ATTTAAATCTTCATAAGAGTTTATGGCATTTACAACGTTTTCCATACATTCAACAATTTT 

TTTACACTCATGGCTTGGATTTTGTTGTAATATTTCATTAAACGCATCATATGCTTTATC 

AATGTCTCCAAACAGTAGATAAATCTTTCCTGCCTTAAATAACGCATTGAGATTTTTTGA 

ATTTGCCATTTTATATGATTTTAAATAGTATTTTT^AAGCCTCTTCGTATCTACCATATTT 

AACCGAAATATCTCCCAAAATTTCAAATAATTCCTCATTTTTAATTTTTTCAGAGGCCTT 

TTTAAGATATTTATATGCTAAAATTATTTCCCCACGTTTATAGTGAATGAGTCCTTTAAG 

ATATGCAAAATATATGTTTGATGGAGATATTTTTAACGCCTCATTTATTGCCTCAAGTGC 

AGAGTCGTATTTTTCTAAATGGTAGAGTGCATAAGCAAGATTAAACCAATCAATAGGATT 

GGTATTTTTCTTTTCTAATGCTTTTAAGTAGCATTCAACTGCTTTGTCATATATTCCTTC 

ATCTAAGTAATAGTTAGCCTCAGTAACCCAATCTTCATAGGATTTAAGTTTTTCACTTAT 

CTTTCTGAACAAGTTCATTGTGAATCACCAAATTTTATGCCCCACATTATTTGCTTTAGT 

TATAATTATTTCTCCATAAACTCCATATTTATCAAAAATATCGTTGGCTTTTTCAACAAT 

TAATTTTTTGTCTCCAAATGCATAAATTGTTGGGCCAAAGCTTGTyU^GTCCTGCATAAAC 

ATCTTTATGCAATTCATTAATTAAATCTTTAAC/VATATCTGATTGTAAAGAGAGTTCAAC 

TTTTTTAAAGCCTAAGTATTGAAGCTTGTTGATAACTTCTCCAAAATCATCTAAATTTTT 

TTC7VACAACTGCTGGCATCATCTTCATTAAAACTAAATGGCAGATTTTTTCAACTTCATT 

TAAAGGAACTGGGCAGTATTTTTTAAATATATCCACTTCTTTTTTTCCATAGACATGTTC 

TCCTTTTGGAATTATTAAGATAGTTTCCCAATCAAAATCATGTCTAAATATTATTGGTGC 

TGGCTTAACTCCTTTTGAAGCAGATGAAGGTCTAAAATCTTCTTTATCCTTACCCTTGCC 

AAAACTATGCCCTCCATCAATTAAAAATCCTCCATACTCAAAAGCCCCTATTCCAATGCC 

TGAAGTCCCTCCCCTTCCAGTAATTTTAGCAATATTGTAGGCGTTCATTTCTTTATTGTA 

TATTTTTGATATTAATTTACCTACAGCCAAAGATAGCTGTGTTCCACTACCAAGACCAGA 

ATGGGCTGGAAATAGTGATAGGATTTTTAAATCAACTCCCTCTCCACCAATAACATCTAA 

AACTTTGATAGCTGTATTATATACTCTATCTCTAACAGATTTTATATAATCTTCTCCATA 

CTTTTCAATCAATTTTTTATCAAACTCAATGGATATATCATCACTTTCTTTTCCTTCAAT 

TTTTATATTTGGCTCCTCTAAAGCCAAACCAATACCTCCATCAACTCTTCCAATAGAACC 

ATTCAAATCTATAAGCCCCATGTGAATCCTTGATGGTGTTTGAATTATCAAAATCTCACC 

ATTATTAAGGTTTTAAAGATAATAACAATAACAACCAGATGTTCTATAAATTATAAATAT 

TTACAACAAAAAATAAAAAGTTTGAAGCTTAAATTAATGCCTCTATCAAATCCCCTCTTG 

TT^CAATACCAATTAAATTTCCTTCATCATCAACTACTGGCAATCTTTTGATGTTATTTT 

TAACCATCAACTTTGCTGCATCATTAATTGTCATATCTGGCTTAGCAACAATAACTTTTC 

TTGTCATCACATCCCTAACCTTTGTTTTTAATGCATTTTTTAAATCTTCCATAAATTCCT 

CTATCTTTAAAGCTGTTTTTAGTGGAAGTTCAATCAAATCCAATGGTGATGGTAAAATGA 

GATTTAAATCTTCATTATGTGTAACAATGGTTTTCACTATGTCACTCTCTGAGATTATTC 

CCACTAACTTACCATCTTTATTTAATACTGGGGCTCCACTTATCTTATTTTTCCTAAATA 

ATCTTATTACATCGATTAAATCATTATCCTCATAAACCACAATGGGTTTTTTCATGATAT 

CTTTTATTAACATTATTTCACCATTTATATTTAATTTATTCAATATAGTCCTCAATATTT 

AATCCCAACTCATTACAAATTTCTTTTAATTGGTTGTATAAGTTTTTATCAATTTCAAAT 

CCATCCTTTCTTTTCATTTTATTCCTCTCCTCTATTTCCCCAGGGATTAATATCTCAAAA 

CCTTCTGCTGGCTCTGAGTTTTTAATTTCATCTAACAACTCATCAACTTTTCTTTTAAAC 

TCCTCCTTCCCCATAAAAAATTCTGGATTTATAGCTATAAATAAATCTCCCTTAGTGCAT 

CTCTCCTCTGGATTAGCAGTCCCTTTAACCTTAGTCCCAACCTCAGCCCCACCGATAGCT 

GACAGCATTTCGATAGCTAATGCCAAACCATACCCCTTAGGTCCTCCAAATGGTAATATA 

CATCCTTCCAATGCTTTAGCAGGGTCTGTTGTTGGCTTTCCATCTTTATCTACTGCACAA 
CCTTCTGGAATCTTTATTTTTTTTCTTAAAGCTTCTAAAATCTTTCCTCTTGCAATTGAA

GCAGTAGCCATGTCTAAGGAAAATTTATACTTATTTCCTTTAAATGCTATAGCAATTGGA 

TTTGTTCCTAAAATTTTCTCTTTACCACCAAAAGGAGCCATAGCTGGCTCTGTGTTTGTT 

ATTGTTATTCCAATCATATCTTGATTCATAGCTAACTCTGAATAATAGCCAGCGATACCA 

AAGTGATTAGCATTTCTTGTAGCAACAACTCCAACTCCAACATTTTTTGCCTTTTTTATA 

GCTAATTCCATGGCTTTTTTTCCAACAACTTGACCTAAACCCAAATCTCCATCTATAACT 

GCCGTTGCTGGGCTTTCTTTAACTATCTTTATATCTGGCTTTGGATTTATATTTCCTAAT 

TTTAAGGCAGTTATATACTGTGGAAACCTTCCAATTCCATGAGAAGTAAAACCCTTTAAA 

TCAGCATCAACAAAAACATCGGCAGTTATTTTGGCATCTTCCTCTGGAACACCAAATTTT 

TTTAAGACATCAATTATTAACTTTTTTTCATTTTCTGGTTTTAAAATCATTATATCCCTC 

CAAAAATTTTTAATTTTATGGTTTTACATAGGTCATGTTATAATAGACAATATCCCCATC 

TGCATCAACGATTGCTATGAGTAATTTTTTTCTAACTGAGTGAGCAACCCTAACAAACCC 

AGTTAGCTCACTTAATAGAAAAGAGCTATCTTCAGGAAATACCTTAACCAAATAAACAGA 

GTGTTCTTTATCTATGTTAGCTCCCCTCTCATAAAGCCTAAAATCAGCCCCATACTTCAA 

ACCAGTCTTTACTATATAACCTCTTGTTCTTAAATCCTTATAAACTAAATATTTTAAACA 

TAGTCTTTCTTCAACATTTCTCGCATATTCATATAGTTCTTCAAAACTTAGAGGTTTGTT 

ATCTTTATATTTCACTTCCAACCATCCTAAATTTATCAAATAGAGGGCTTCAACTAMGA 

TAGAGATAAAAAATTCCCTTCAACATTTCCATAATGCCTTGCTGATAACTTAGATATCCC 

ATTTTTGTCAAACACTATAACTCTATCTCCATCCAACAATCCAGTTATTTTTTTGCCCAT 

TTTATCTCTCACCAAAGTTATTATTTATAAAATCTTAAAATTTATTGTGGATAATAAAAT 

AAATAACATATGGTTTATGTTATTTAACAAAATTAATGAATGAATTAATATAGAACTTCG 

CAGTTTTTATATTAAAAAGGTATTTAGATGCCTAAAGGCATCATTATTCAATAAATCATT 

TATTCCTGCGAAAGTTCTATAATATTGAGGTGAATCTATGATATTCCATCCAAGACCTTC 

ACCAATAGCTGCTGCAATGTATCAACTTAGGGATTTGGGTGTTGATGCTATAATTTTACA 

TGGTCCAAGTGGTTGTTGTTTCAGAACCGCAAGATTATTAGAGTTAGATGGAGTTAGAGT 

ATTTACAAGCAATATTGATGAAAATGCTATTGTCTTTGGAGCTTCAGAGAATTTAAAAAA 

AGCTTTGGACTATGCAATTGAATATTTAAAAAAAGAGTTAAAGAAAGAGAGGCCAATGAT 

AGGCATAGTTGGGACGTGTGCAAGTATGATTATTGGTGAAGATTTGTGGGAATTTGTAGA 

TGATGATAGAGCCATAATTATCCCAGTTGAGGTGCATAGTGGAAGCGGTGAT7VATACAAT 

AGGGGCAATAAAGGCTATGGAGTCAGCTTTAAAATTAGGAATAATTGATGAGAAAGAGTT 

TGAGAGACAGAAGTTTTTATTAAAAAAAGCTACTGAAGTTGAGAAAAAAAGAGGCATGGC 

AAAGAAAGAGTATATAAAGCCAACTTATGATGATGATTTAAATGAAGCTATAAAAGTTTT 

AAAGGATTTGAAAGAAAAAGATGGGAAAATAGCATGTGTGTTGAATGCTAAAA7VAGAAAC 

TGCCTATTTGTTTGCTCATCCTCTAATTGTTTTAAATAAGTACTTTAACTGTGTAAATAT 

AGCAAACTTAGATATAAATAAGGGACTTCCAAAGATAAGAAGAGATGCACAAAATATATT 

AAGAAGGTTTAAAGCAGATTATATTACTGGTGGGTTAGATGAGTATCCAATAACCGGAGA 

GAGAGCAGTCGAAATATTAAAAGATTTGGATGTTGATGCTATTGTTGTCTCTGGTGTTCC 

TCATGCTTTACCAATTGAAGAGATAGATAAAGACATAATAAAGATAGGCATAAGTGATGG 

ACCAAGAACATATCATCCAATAAAAGAAATTTATGATTACGCAATTGTTGAATTAGATGC 

ACATGCGAAGGTTTTAGGGAAAAGAGATATTGTAAAATCAAGATTTGGAGAAATATTGGA 

TTATGCATTGGAATAAAGTTTAAAAATTATTAATCCATAAAAAATTTTGGTGATAATAAT 

GGAAAAACCATGGGTAGAGAAGTATAGACCAAAAACATTGGATGATATTGTTGGACAGGA 

TGAAATAGTAAAGAGATTAAAGAAATATGTCGAAAAAAAGAGCATGCCGCATTTATTATT 

TAGCGGACCTCCAGGAGTTGGAAAGTGCTTAACAGGAGATACAAAAGTTATTGTAAATGG 

AGAGATTAGAGAAATTGGAGAAGTTATTGAAGAGATAAGCAATGGAAAATTTGGAGTAAC 

TTTAACCAACAACTTAAAAGTTTTAGGAATTGATGAAGATGGAAAAATTAGAGAGTTTGA 

TGTGCAGTATGTCTATAAGGATAAAACCAACACGTTGATAAAAATAAAAACCAAAATGGG 

TAGGGAGCTAAAAGTAACAACTTACCATCCACTTTTAATAAACCACAAAAATGGAGAAAT 

AAAATGGGAGAAAGCAGAGAATTTAAAGGTTGGAGATAAATTAGCAACACCAAGATACAT 

TTTATTTAATGAAAGTGATTATAATGAGGAATTAGCAGAATGGCTTGGGTATTTCATAGG 

AGATGGGCATGCAGACAAAGAATCAAATAAAATAACCTTCACAAACGGTGATGAAAAACT 

TAGAAAGAGGTTTGCAGAACTTACTGAAAAGTTGTTTAAGGATGCAAAAATAAAAGAGAG 

AATACACAAAGACAGAACACCAGATATTTATGTTAATTCAAAAGAAGCTGTTGAATTTAT 

TGACAAGCTTGGTTTAAGAGGAAAGAAAGCAGATAAAGTTAGAATTCCAAAAGAAATAAT 

GAGAAGTGATGCATTAAGGGCATTTTTAAGAGCATACTTTGATTGTGATGGTGGTATTGA 

AAAACACTCAATAGTTTTATCAACTGCAAGTAAAGAAATGGCAGAGGATTTAGTTTATGC 

CTTATTAAGGTTTGGAATAATTGCAAAATTGAGGGAAAAAGTAAATAAAAACAATAACAA 

AGTATATTACCATATTGTTATCTCAAACTCTTCAAACTTAAGGACATTCTTGGACAACAT 

TGGATTTAGTCAAGAAAGAAAACTTAAAAAGCTCTTAGAAATCATAAAAGATGAAAATCC 

AAACTTAGATGTTATAACTATCGACAAAGAGAAAATAAGATACATAAGAGATAGATTT^ 

GGTTAAATTAACAAGAGACATTGAAAAAGATAATTGGAGTTACAACAAGTGCAGAAAAAT 

CACTCAAGAACTTTTAAAAGAAATATACTACAGATTAGAAGAGTTAAAAGAAATTGAAAA 

AGCATTAGAAGAAAATATATTAATCGATTGGGATGAAGTTGCAGAAAGAAGAAAAGAAAT 

TGCAGAAAAAACTGGAATAAGAAGTGATAGGATTTTAGAATATATAAGAGGTAAAAGAAA 
ACCAAGTTTAAAGAACTATATAAAAATTGCCAATACCCTTGGTAAAAATATTGAAAAAAT
CATTGATGCAATGAGAATCTTTGCTAAAAAGTATTCAAGCTATGCAGAGATTGGAAAAAT 
GCTCAATATGTGGAATTCAAGTATAAAAATTTACTTAGAGAGCAATACCCAAGAAATTGA 
^^?^^J'''^'''^''™^''''^^^^^'^'=T™CTTGTAAAAGAGATTCTTAACGATGA 
™r^™*'''''''''''''"'^^^^^^^'"^^'^'^'^^CTTAGCATCTAACGAAATTTATTGGGA 
CGAAATTGTTGAAATTGAGCAATTAAATGGTGAATTCACAATCTATGACTTACACGTTCC 
AAGATACCACAACTTTATTGGTGGGAATTTACCAACTATACTGCACAATACAACCGCCGC 
TTTATGTTTAGCAAGAGATTTATTTGGAGAAAACTGGAGAGATAACTTTTTAGA^SaS 

TGCCTCTGTTTCAAAAGATACACCAATATTGGTTAAAATAGATGGAAAGGTAAAGI™
AACCTTTGAAGAACTTGATAAGATATACTTTGAAACTAACGATGAAAATGAGATGTAJJ^

GAAAGTTGATAACTTAGAGGTTTTAACTGTAGATGAAAACTTTAGAGTTAGATGGAGliJ 
GGTTTCTACAATAATTAGGCATAAAGTTGATAAGATTTTGAGAATTAAGTTTGAAGG^ 
ATATATAGAGCTAACTGGAAACCACTCAATTATGATGCTTGATGAAAATGGTTTAG??Gr 

AAAGAAAGCAAGTGATATAAAGGTTGGGGATTGTTTCTTAAGCTTTGTAGC?AI?A??GA
AGGTGAAAAAGATAGGTTGGATTTAAAAGAGTTTGAACCAAAGGATATTACTTCIAGGG?
TAAGATAATTAATGACTTTGACATTGATGAAGACACTGCATGGATGCTTGGATTCTATC?
TGCTGAAGGAGCTGTAGGCTTTAAGGGGAAAACATCTGGACAAGTTATTTATACASIGG
TAGCCATGAGCATGATTTAATTAATAAATTAAATGATATTGTTGATAAAAAAGGASJAG

TAGAATATTAAATACCCAACTTGCGAGATTTGTTGAGGAAAACTTCTATGATGGTAATGG
AAGAAGAGCAAGAAATAAAAGAATTCCAGATATTATATTTGAATTAAAAGAAAATCTA^?
AGTTGAATTCTTAAAAGGATTGGCTGATGGAGATAGTAGTGGAAATTGGAGASJIS?^?
TAGAATATCATCCAAATCAGATAATTTATTAATCGATACGGTATGGCTTGCAAGA^S??

GTGGAAGAAAAGCAACTTACTACCGGCTGAGCCAATAATCAAAATGATTAAAAAGTTAGA
GAATAAGATAAATGGAAACTGGAGATATATATTAAGACATCAACTCTATGAAGGTAAAAA
GAGAGTTTCAAAAGATAAAATTAAGCAAATTTTAGAAATGGTCAATGTTGAGS??J??
AGATAAAGAAAAAGAAGTTTATGATTTATTGAAAAAGTTATCTAAAACAGAGTTA?I?G^
GTTGGTTGTTAAAGAGATTGAAATTATTGACTACAACGACTTTGTTTAS?GTATCIG?

TCCAAACAATGAGATGTTCTTTGCTGGAAATGTGCCAATATTATTGCATAATTCTGATrl 

AAGAGGGATAGATGTAATTAGAACAAAAGTAAAAGATTTTGCAAGAACAAAGCCAATTGG
GGATGTTCCATTTAAGATTATATTCTTAGATGAGAGCGATGCATTAACTGCAGAJJSCA
GAACGCTTTAAGAAGAACAATGGAGAAATATTCAGATGTTTGTAGATTTmCTTGSTC
TCTAACTGGAGATGCAAAAATAACTCTTCCAGATGAGAGAGAGATAAAGSGAGSAS?
I^^^^^'^^^^^^'^^^GCTTAAACATGTTTTAAATAGAAATGSGAGGASi
AGTTTTAGCAGGGGTTAAATTTAACTCAAAGATAGTTAATCATAAGGTTTATAGATTAGT
TTTAGAAAGTGGTAGGGAGATAGAGGCAACAGGAGACCACAAGTTTTTAACAAGSGSGG
ATGGAAGGAAGTTTATGAGCTAAAAGAGGATGATGAAGTATTGGTTTATCCAGCJ??GGA



SSS^^^^'^^^^'^^^'^^'^^'^^^^^^g^ggataattggcttaaatgagttctacgaatt 

^rI^S'^''''''''''°''^''^™^^^«G^TATAAACCATTAGGTAAAGCAAAAAGCTj?IJ 

AT^I ^'^^'^^''^'''''''''''^^^^'^^■^'"^gtagagttttggagctctcagJJJJ 



AATATTGATAACTGCTACAATAAATGAACTTGAAGGAATTA^GAAAGA???JG^S??

GATSGTGAAftGCMATMTAAaMCTArCTMAATflTCTATCAASflT?™^^ 
GGS?^fJ??'''''"='''*==''"=''^'='=«=«=™TCTAAGAA?™S^^ 



=g=g%^I?§????aTaSJ=^ 



TGAGATTGTTTATAAGGTCTCATCAAGAGCAAGACCTGAGGAAGTTAAGAAGATGATGGA 
ATTGGCTTTAGATGGAAAGTTCATGGAGGCAAGAGATTTATTGTATAAGCTTATGGTTGA 
GTGGGGAATGAGTGGGGAGGATATATTAAACCAGATGTTTAGAGAGATAAACAGTTTGGA 
TATTGATGAGAGGAAGAAGGTTGAGTTGGCAGATGCTATTGGTGAAACTGACTTTAGAAT 
AGTTGAGGGAGCTAATGAACGAATTCAATTGAGTGCTTTATTAGCAAAAATGGCGTTAAT 
GGGAAGATAATTTAACCTTCTTTTTCATGAATAATTTTATTATTTCCATTVAAnATAGACG 
TTGAAAATGCCCTCACCAAACAAATAAnCCAnTCTTTTAAATTTAAAGAGTAATTTTTTC 
TTTTCTTTAAGTTCCTTGTATCCATATATTTTTAATCTTTCTTCAACTTCAAGATTTGAT 
AAGCC7VATCATCAATATCACTGCAAAAAATGTATATGGCAATGTTTATAATTCACAACGT 
ATAAACCTTTTTTAACATCCTATCATATTATGAAAAGGTTATTTTACACATAAAAAGTAG 
GAGATGATTATGAAAAGAGTTGTGATTGCCGGAACATCAAGTGAAGTTGGAAAGACAGTT 
ATCTCTACTGGAATTATGAAGGCATTATCAAAAAAATATAACGTTCAAGGCTATAAAGTT 
GGGCCTGACTATATAGACCCAACATATCACACGATAGCCACTGGAAATAAATCAAGGAAT 
TTAGATTCTTTTTTTATGAATAAAGAACAAATAAAATATCTTTTTCA/^AACATTCAAAA 
GATAAGGATATAAGTGTTATTGAGGGAGTTAGAGGGCTTTATGAGGGAATATCTGCAATA 
GATGATATTGGAAGCACAGCAAGCGTTGCCAAGGCTTTAGATAGCCCTATAATCCTGCTT 
GTGAATGCAAAGAGCTTAACAAGAAGTGCAATAGCAATAATAAAAGGTTTTATGAGTTTT 
GATAATGTGAAAATTAAAGGAGTTATTTTCAATTTTGTTAGAAGTGAAAACCACATAAAA 
AAATTAAAAGATGC/VATGAGTTATTATCTTCCAGATATTGAAATAATTGGCTTTATCCCA 
AGGAATGAAGATTTTAAAGTTGAAGGAAGGCATCTTGGTTTAGTCCCTACTCCAGAAAAC 
TTAAAGGAGATAGAGAGTAAGATAGTGTTATGGGGGGAGTTGGTTGAAAAATATTTGGAT 
TTAGATAAGATTGTGGAGATAGCTGATGAGGATTTTGAAGAGGTTGATGATGTGTTTTTA 
TGGGAGGTTAATGAAAATTACAAAAAAATAGCTGTTGCCTATGATAAGGCATTTAATTTT 
TATTATTGGGATAACTTTGAAGCTTTAAAAGAAAATAAAGCTAAGATAGAATTTTTCAGC 
CCATTAAAAGATAGTGAAGTTCCAGATGCAGATATTTTGTATATAGGAGGAGGTTATCCA 
GAGCTGTTTAAAGAAGAATTAAGCAGAAATAAAGAGATGATTG7VAAGCATTAAAGAGTTT 
GACGGCTATATCTATGGAGAATGTGGGGGCTTGATGTATATAACAAAATCGATTGATAAT 
GTTCCAATGGTTGGTTTATTAAACTGCTCAGCTGTTATGACAAAGCACGTTCAAGGACTT 
AGCTATGTTAAAGCTGAGTTTTTAGAGGATTGTTTAATTGGAAGAAAGGGATTAAAGTTT 
AAAGGGCATGAGTTCCATTACTCAAAGCTTGTCAATATAAAAGAGGAGAGATTTGCCTAT 
AAAATAGAAAGGGGGAGAGGAATTATCAATAACTTAGATGGGATTTTTAATGGTAAAGTT 
TTGGCTGGTTATTTACACAATCATGCTGTAGCTAATCCTTATTTTGCTTCATCTATGGTT 
AATTTTGGTGAGTAAATAGAAGATAAGAATGAAAGAAAAATCTCATATGAGATTCCTGAA 
AAAATTTCCATTTTTGATTTTAGAAATTATTTCATGGATTTTTGAGTTATTTTCATTTGT 
ATTGATTATTTTTGGATTTTCTCTATCTTTAGGATTTGGAAAGGAGATATTGTTGATAAA 
AAAATAAT7UVAATATGAGGCTCATGATAGAAGTTATAAAGGAGAAAATCGTAGAGAGGAA 
GCTTTTTAAAAGGAATAGGAATCGATAGAGGTTAAAATCTTAGCAGGGCTTTTATACTAC 
CTCGGATTATCGTTAAGGAAGGTAAGTTTATTCCTCTCCCAATTCGAAGACATAAGTCAC 
GAATCGATTAGAATTTATTATCACAAGATTAAAGAGGTTTTAAATAGATTTCCAAGTAAT 
GGTAAATTCGATACGGTTGTTAGTTGAGTAAAAAGCTTCATAATGTTCTATAACTGGATG 
AAATCGCTAACTTAACAACCTCATAAGGATTTCACAGTTTATATATTAACTTTGGAGCTT 
AAGTACTAAGAATAAGAAAGGGTTATAAAAATTCATTCAATAAAATTCTAAAAACTTATT 
CAACAGTAACGCTCTTAGCTAAACCTCTTGGTTTATCAACATCTCTTCCCAATTCAACTG 
CCTTATAATAAGCTAACAACTGGAAGGCTGGAGCATAAACAATTGGAGAAATCTCTTCAA 
TTACCTCTGGAACTAATATATTTTCAGCTCCATCTATTTCAGTTGGAGTTATGGCTATAA 
CTTTTCCCCCTCTTGCTTTAACCTCTTCTATATTTGATAATATTGAGTTAAATACTGCAG 
AATCCCTTGGAGGAACTATTGCTACAGTATCCATATTTTCATCAATTAGGGAGATAGTTC 
CATGCTTTAACAGTCCCCCACTCATCCCCTCAGCATGTAAATAAGTTATTTCTTTAAATT 
TTAAGGCTCCTTCCAATGCACTTGCAATATTTATTCCTTTAGAGATGAATATGTAGTTAT 
TTACTTTTAGATTGTTGGCTATTTCTTTAATTGTTTCTTTTTTATCTAAAACCTCCTTTA 
TATAATTCGGAATTTTATCAATCTCTTTCTCATATTCACTCATATCTCTACCTAAAAGCT 
TTCCATATTCAATAAACAACCTATACAGTATCATTAACTGGGATGTGTAAGTTTTAGTAG 
CACAGACAGCTATCTCTATCCCTGCTCCCATCATAACGGTTATATCCGCCTCTCTTGTAG 
CTGTGCTTCCCAAAACATTAACTATAGCTCCAGTTTTTGCCTTATTTTTCTTAGCAAATC 
TCAATGCCTTTAAAGTATCGTAGGTTTCTCCACTTTGTGTAATCCCTATAACTAAGGTTT 
TATCATCAACAACCCCTTTATTTAAAAATTCAGATGCATCACAAGCTAT7U\CCAGCTTTC 
CAAGCTTTGCAAACAAATACTCTACAACCATTGCCGCATGTAAGGAGGTTCCCATGGCTA 
CAAAATATVACCCTATCATAATCTTTTATACATTTTGCCAATTCTTTTU^TTTCTTCAGCGG 
ATATTTTGGCAGAGACTTTTAAAACCTCTGGCTGTTCCATAATTTCTTTTAGCATGAAGT 
GAGGATAACCCATCTTTTCAGCAGAACTTATATCCCAATTGATTTCCATCATCTCTCTTT 
CAACAGTATTTCCATTATTTTCTATAGTTACTTCATATCCATTTTCTTTCTTTTTAATTA 
CAACAACATCTCCATCCTCTAATGGAATTGCTTTATTTGTGTAATCTAAAAAGGCAGTTA 
TATCACTCCCTAAAAAATAGCCGTCATCATTAATTCCCAATATTAGGGGACTTTCATTTC 
TTGCCCCAATTAATAGGTTTGGGAAATTTTTATTTATTATAACTAATGCATAAGTTCCTT 
TTAATTTTTTAATTGCATTTTTAACGGCTTTTATGTAATTTTCTTCATTAATTTCTTTAA

^TTTTTTTAATTCTTCTTCAATTAAGTGAGGGACAACTTCAGTATCAGTTTCTGATTTAA 

ATTTATGCCCCTTCTTCATTAATTCATCTTTTAACTCTTTGTAGTTAGAGATGATTCCAT 

TATGAACTACTGCAATCTCTTCTTTGCAGTCAGTATGGGGATGAGCGTTTTCTTTGCATA 

CATTTCCGTGTGTTGCCCATCTTGAATTATGATTAATTATTAAATTCCCAATGAAATTAT 

GATAGTCCTCAACCTCTAAATCATAGACATATTCAACATCAGATTCAACCTCTTCAATTT 

TAAATTTTGTCCAAACTATATCAGCGTCTAAAAATCTTTTTAAATACTCTGCTTTATCAT 

ATAGTCCTTTACTGTTTAGTTCTTCAATAATCTTTTCTATTGTGTAATCGGTGCAATAGT 

TATCTCCATTTTTTATTGTTTTTAATGGAACTCCTACAAATTCCCTAATCTCTTTTTTAG 

TTAAAGGAATTGAAATATATCTAAAGTTAAGTCCTTTCATTTTGTTTAATATAGCCTCTA

ATTTCTCCATTTTGTCCTTTGCAGTAAAACCAATGTATTTTTTAAATAATTCA7VAGGATT 

TTTTATCACTTATAAGAAGCTTATGAGTATTGTTCCAATTTTCCTCTTTTCTTTTAATTT 

TTGAATATGATGCTAAAATTCCAAATCTCAACAACAAGAACTGAATCTCTTTTATAAAGC 

ATTTGGAAGTCATTCCTATACCAATTTGCTTAGCCTCAGCTCTTATATAACCTTCTGCAT 

CAAATATTCCTCTTAAATATGATGCAACCAAATCATTATTTAATCTAAATACAAATTCTG 

GAGTTCTCTCATTACCGGTTTTATTAAATAACTCTGGAATGTTTTCTCTGAACCAATCAA 

TCAGGTATTTGCTGTTTATCTCTAATATATAATAATTGCCATCTCCTTTTTTGATATTCC 

CTTCTAAGTTAAAGACAGTTTTAAATAGTTGATTATATTCCTCTAAAACTTCTTTTCTTT 

CATCCTTCAATCTCAACATCCTATTTGAAGGGAAATGCCCATCTCCAATAATATAACCAA 

TAATCTGCATTAATTCTGGAGTAGGAGTTTTTGGAAATTTAACAGGATTTGTGTAATGTA 

AGTTATCCCTATATATAATCTCTTCAAAGTTGATACCATAAAGAGAACATAATTTCTTTA 

ATCTCTCTTCTTCAATACTCTCTAATTTTCCAGTTTCAATTTTTACAATATATATTTCTT 

TAACTCCACATAATTTTTCAACGTCTTTTCTTGTTAGTCCTAATTTTTCTCTAACTTTTC 

TTAGTTTATTTCTTATTGTTTCATCTAACTTATAATGCCTTTCAACATATACATCCTTAA 

ACTCAACATTATCATTAAAGCTATAATTTAACTTCCTCACTACACCAATTAACTCACTTC 

CATTCAAATCTTTAACACACTTCTCAACTATCTTTCCATTCTCTACAACAAACAATTTAT 

GTTCTCCAGTTGTAATAAGTTCAGAAAAGGCAGTTTTTATCTTATATAATATCTTTGGTG 

CTTTATGTTTAAACTTTTTGATTTTTTTATTATATAGCTTTAAATCTTCAAAATTAACTG 

ATAAAACCTCATCTTCATCAATTTCAGAAATCTTTTTCATTCTTCCATCTGGCAATATAA 

CATAAGTATCTGGATGCAAACAATGCCCTATCCCAATATTTCCATCAATATCCAAAAATC 

TCTCTTTTTTAGCAACCTCTTCAACTTTGCCCACATTCTTTTTAATAATTAGTTTATTAT 

TATCAACAACTCCAATTCCACAGCTATCATATCCTCTATATTCCAACCTTCTTAATCCAT 

TTAATAAGATTTTTGGAGCTTTATCATTACCTATATAGCCAATGATACCACACATAAATT 

TCACCGATAAACCTAAATATCTCCTAAAGTAATAAATAGTTAAAACCCATAAACAATAAT 

ATTAATACTAATTTTAAATAATTCTCTTTTAAATATATTATATAATGCTTTGTTTGGAGG 

TGAGAGTTATGGTATTTGGTAGTGCAGCGAGTGAGAAAACACCTGAGGAAATATTAAAAG 

GAGTTGCTTTGATGTTGGATGAGATAATTAACGATACAACCGTTCCAAGAAACATTAGAG 

CTGCTGCTGAAAAGGCTAAAGAAGCTGTTTTAAAAGAGGGGGAGGAGCCAATCGTTAGAA 

GTGCAACAGCAATCCACATCTTAGATGAGATTAGCAACGACCCAAACATGCCACTTCACA 

CAAGAACACAAATTTGGAGTATTGTTAGTGAATTAGAAAGAGTTAAATAAATTTAAAAAT 

CCCCACTATTTCTTTACAAGAAGGTTTAAAAGTGTAAGTTTAGCTTGTTCTTCTAAAACC 

TCCACTTTTATATATGCATCTATCAAATCTTTACCTAAACAAACTACTCCATGATTTTTT 

AATATAATAACGTCCTCATCTCTTTTTGCTGTTTCTTCAGCTAATTTTAAACTACCTGCC 

TCATAGTAATCAACATAACCAATTTTCTTCAAAAATATTTTTCCTTCTGGTGTTAAAAGT 

TCTATTTCTTTGTTTATTGTCGATAAAAAAGTTGATATAAGTGAATGAGTGTGGATTATT 

GCGTTTATGTCATTTCTTTTTCTATAAATCATTAAGTGGAGATTTTTTTCTGATGTAGGT 

TTTCCTTTTATAACATTACCATCCAAATCCATTTCAGCTATATCATCTTCCTTTAAAAAC 

CCTAAAATAGAGCCAGTTGGAGTCAGATATATTTTATCCCCCTCTTTAACTGATACATTG 

CCTCCACTACCTACAACATATTTCCTATCATACAATTTTCTACATATTTTAATAAATTGC 

TTTTTGTCCATAATCTCACTTAAAATATTTTTATTAGTTTCAACATCAAGATTATAACCG 

TTATATAAAGAAATACAATTAAAATCCCTAAGATTATGATATTAATGCGGATTATTTTTA 

GGTTTCTTTTTAAATCCTCCATGGTTTTTATGTATATTCCTAAAGCTAACAACGCCTCTT 

GTTTTAATATTTTATTAGTTTTTTCATCTAATTTTTTCTCATTTTCTTTTAGTTTTGCAT 

CTACTCTCTTCTCTAATTCTAAAATTTTTTTACTCAATACTTCAAACCTCTCCTCTATTT 

CCTTTTCATCCATAGTAGCCCTTATTTAAATATTAAGTTAATAATTAGATAATAGTGATT 

TTAAATATTGAATAGCTATTTGTATATAATTAAAAAGCTGGGTAGATAAATATTCATTAT 

TTAATATTACCATATAGCTAATTGTAAGACCTATTGCAATTATTACTATGAGTAATAGCA 

TAGATATTTGTTTTAATTTTCTTTCCATTCTATTTTTATGTTCTATGTCTTCTTTAACAT 

ATTTCTCCATTTCTTTTAAAATTTCATTATTATTTTCAAGATATTTATCAATTTTTTGAT 

TTAGCTCTTTTAACATATTTACATTTTGCCTCATAAATTCGTTTAATTGAACATTGGAAG 

AGTTGATATTTTTATCTTTAATCTCATCAAGTTGCTCATATAATTTCTTTATTTCATCTA 

ATATTTTATTAGTTAAATCTTTCTGCCCCCCCCCTATTATCCAAATCAAATACAAGAGGA 

ACTATCTCATCCAAGGTTTTTACAGGAATAATCTCTATTCCCTCTGTTTCAATAACATCT 

ATCATGTTTGCCTCTGGAATAATAACCCTCTTAAACCCGTATCTCTTAGCTGCCTCTATC 
TTCTCATTAACTCCTCCAATAGCTAAAACATTCCCACTTAAATCTAAGCTTCCAGTTATT

GCAAAGTCCTGTTTTAATGGAATGTCTAATAAAGCAGATATTATAGCTAAACAAACTGCA 

GCTGTAGCACTATCCCCATCAATCTTTGAATATGACTGACTGAATTGGATATATATCTCT 

TTATTATTTAAGTCAATATCTTTCTTAGGTAGAGGAAGTTTCTTCTCAGCTACTAATTTT 

TTTGACAATGCTGAAGCTAAAGTTATTGAATGCTTTGCAATATCTCCAGAAATATTTAAT 

AGATGAGTTCCTGGGTTTTTTGATTCTAATATTTGAACAATAATCTTTGTTACATCCCCT 

ATTCCTCCAGCTCCTAATACAGCTAAGCCGTATATAACTCCAACCTTTGGTTCATCATTT 

GGCACAATATGCTTGTATCTCTTGAAGTTTTTGATGTAGTTTAATGCCACCTGTTTTTCC 

ATACTGTAAATTCCAGTATCAAATACCTTTCTCACGTGTTCTGCAGTTATATATACCTTG 

TTACTTTTATCTTTTTGAGTTTCTGGATGATATTCTCCCTTATCATCAAAATTGCCCAAT 

AATTCTTCAACATCTTTACCCATAGCTACATCATTTGCCATTTTTATAATATTTGCAAGC 

AATCTTAACCTTAAAGTTAATTTATCCTTTGAACCTGCCAAGTATTGAGCAATTCTGACA 

ACTTCACAACATCCATCGTAAGTCATTGGGTTTAAGTTGTTGTTCTTTATCTCTTGAACT 

ATAAACTGTAATAACTTATCCCTATTTTCTAGGGTGTTGTCCATTTTATTCTTTAAAACT 

ATCTTATAGTCAATCCTATCCAACAGTGGAGCTCTCAAATTATAAACATCATCCATGTTT 

CCAGACATTATTAGGATGAAGTCACATGGTATTGGGTTTGTTTCTACAGTGGCTCCACTT 

GAATTTGGATTTCTCCCACTTATTGGAAGTTGTTTATCTTGTAGAGCAGTTAAAATGTAG 

TCCTGAACTTCCAAAGGCATTGTTTTTATTTCATC7VACGTATAAAATTCCTCTGTGTGCC 

TCGTGAATAGCTCCTAATATAATCCTCTTATGTGGAGGAGTTCCTAATGGAGGTCTTCCA 

CCTAATGGACAGTGTTTTATATCCCCTAACAACCTTGTTACGTTGTAAGCACTTGCTCTT 

ACAAGAGGTCTTTTTTTACATTCATATAAGAGGACTGGCTTTAAATCCATTGGATTTAGA 

TTATTTGGCATTGAAGCCCTTGAAGCCCCCATTATGCTTGTTAAAATAATTACAAATCCA 

AAGATTAAAACTATTAGAGCAGTTATGGTTACAGCTGCCAGTAAGTAATTTTGTGGTAAi\ 

TATTTTAAAAGATATTCTGACAGTAGTATAGCACCAATCATTATTAGAAGTAAAGTTGTT 

GAGCTTGGAGCTTTAAAATCCAATTTTGGCATGTCTTTTGAGTCTTCTTTATACTCTCCA 

TCTATAACCTCAACTATTGGTCTCTCCATATTCTTTAAATTTGGTTTTGCAATTACATAA 

TAAGGAGTAAATTCACCAAAATCAGATAAAATTTCTCCAACTGCTTTAACTATCATTGAT 

TTTCCTACTCCAGGGTCTCCTAATAAAATAACATTTCTCTTATTTTTTACAGCAGACAAA 

ACAATTTTTACAGCTTCCTCTTGTCCAATAACTTGGTCAATTAACCTTGGTGATGGTTCT 

GGCAATTCCTCAGTAGTTTTAAATTTTATTGAAAACATATTAACACCTTATAAAAATCTC 

TGTAAATATATTGACATATATAAATCTTTTAAATTTTTAGTTACTATTAAAAGGAAGATG 

CCTTATCATAATATCATAATCTTATATTTATAATTTTTAGTTATGGTGATATGATGGATT 

TAGAAGGATATGTTAGAAGATGCCTAAGAAAAAAAATCCCAGAAAATAAGATTATTGAGG 

ATGGGTTTAAGAGAATTTTAGAGATTAAAGAAGATGTAGATGAGGAGTTTGC7VA7VAAAGT 

TTATAAAGGCAATTTTAGAGGAGGTAAAGACAACTGAAAAATTTAGAGAGATTGATGATG 

AGAATTTAAAAACTCTACTAAAATATCCAAAATCTGGAGTAAC/^TGGGAAGAATGGGAG 

TTGGTAGTAGAGGAGAAGGAGATTTCTTTGTTCATAGAGAAATAGCAAGGATTGTTAAAA 

GCACTAAAGTTAAAGCCTATGTTTCAGCTGAAGAGCAAGATGATGCAGGGATTGTTAGAG 

CTGATGCTAAATACATAGTTGCGGCAATAGATGGAACTCACTCMGGCTTAGTGATTTCC 

CATTTTTAGGAGGTTTTCATGTAACAAGAGCTGCTTTAAGAGATATCTATGTCATGGGAG 

CTGAGGCGGTTGCTTTAATTAGTGATGTGCATTTAGCTGATGATGGAGATGTAGGGAAGA 

TATTTGACTTCACAGCTGGAATTTGTGCTGTTTCTGAGGCTGTTAATGTTCCTTTAATAG 

GAGGAAGCACGCTGAGAGTTGGAGGGGATATGGTAATTGGAGATAGGTTGGTTAGTGCTG 

TTGGTGCAATAGGAGTTATTAAAGAGGGAGAACCAACAGCAAGAAGAAACGCTGAAGTTG 

GAGATGTTATTTTGATGACAGAGGGTAGTGGAGGAGGGACGATAACAACAACTGCCCTGT 

ATTATGGATGGTTTGACGTGATTTATGAAACTTTAAATGTGGATTTTATAAAGGCATGTC 

AGAATTTGATTAGT^GCGGTTTAATTAAAAAGATTCACGCAATGACAGATGTCACAAATG 

GCGGTTTAAGAGGAGATGCTTATGAAATTTCAAAAACAGCTAAGGTCTCTTTAATATTTG 

ATAAAGAGAAGGTTTATAAAACAATCAATCCAAAGGTTTTAGAGATGCTTGAGGTATTGA 

ATATAGACCCATTAGGAGTTTCAACAGATTCTTTAATGATTATCTGCCCTGAAGAGTATG 

CTGATGATATAAAAAAGGTTACTGGAGCTATAGAAGTTGGATATGTTGAGGAGGGAGAGG 

AGAGTTATTTAGTTGATGGAAATAAAAAAATCCCATTAAAACCAATGTTTAGAGAATCCG 

CATATACGCCAGTTAAAAAGGTTGTTGGTGAGAGAAAACCTGGAGATTTTGAGGAGATGA 

AGGAGAAAGTTAGGAGAGCATGTGATGAAGCTATTAAAAAGAAAGATTTTGTTGTTGAAT 

TGTTAAAAGAGAGAAAAMGAAATTTTAATACTTTCCTTCCAAGATTTTTACAGCAATTT 

CCTTTGCTTGCTCTCTCTTCTCCTTCAAATCTTTTAAAAACTCTACAACTCTTTCATAAG 

CCATTTTTTTGCATTTACCACAAGTTAGTTCTCCACTTCTACATTTTTGATATATTTCAG 

CTAATTCCTTATCATCTAAGATTAAGTGATATAAAAACAACTCATAAACAACACATTCCT 

CTGGAACTCCCCCATACTTCTTATGCTCTTCTAAAGTCTCTCTTCCCCCAGTTTTTGCTG 

AGAATATTTTCTTTTTAACAGTTTTTTCATCATCAGTCAAAAATATTGCTGTCTCTGGCT 

TTGAAGAACTCATTTTTCCTCCTAACAATCCAGTCATAAATCTGTGATAGGTTGAAGATG 

GAGGAATAAACTTAAACTCCTTAGCTCTATTTGCAATATCTCTTGTTAATCTTATATGCG 

GGTCTTGGTCAATTCCTACTGGAACAACAACGGGTTTTGGTTCTGGACTTAAGTTCTCAT 

CAAGTTGAGGATGTAAAATATCAGCAACTTGAACTATTGGGGCAAAGACGTGTCCAATGT 
TTGTTTCTCCTTTAAATCCATAAATTGCCTTCATCTCACTCCAATTTGTTCTTTTAGATA

AAATTAAAGCCAAATCTTTAACCTTTTGATATTTTGATTGTAAATACACATTAATTTTTT 

CTGGGTCTAAGCCAAGAGCTATGTAGTTGGTTATATACTCATTTAAAGCAAGTTCTTTTG 

TTGTTTCAAAGCTCATGTTTCTTGCCCAATATGCCTCTAAATCAGCTATTGGGATGTTTA 

TATTGTCAGTGTATTTTTGATAAAACTTTAATAAATCTACCACCATTTTATGCCCAAAAT 

GCATTTTACCAGAAGGCATCATTCCACTAACAACTGCAAACTCTTTGTTATTTTTTATTG 

CATCAACTATTCTCTCAAAATCCCTATGCCCCAATATAATATTCCTCCTGAAGAAATGAT 

GTTCCTCTTTCAAATCTCCTAAAACATCAACTATTGGCTTAACTCCAAACTGCTCCATCG 

TCTTTTTGTAATCAATAACTGCTGGAGTTTCCCATGGTGTTAATTCCATTAGTTTCACCC 

TTCATTTTGATATAAAGATTTAAATACATCAGTTGTAATATATTTATTATGTATTCTCTT 

TAAGGTGTCACCTATGAGGTGGGCAATATTTTTGGTTTTATTAACTATAACATTCAGCGG 

TTGTTTAAATAAAGAGATAAGTAAGGAAGAGATAATTAAAAAGATTGATGAAATTAACAC 

ATTTTCTTATAACGCAAAAGTGTTTATAAACCTTAGTGTTTCAAATCCAGCAATAAATAA 

GGTTAATATGAAAATGGATATTGACGGATACAGTGATGGAAAGTTATCAAAAGGGATTAT 

ACATGTTTATTATACAGTTAGATACTTTAATGGTAGGAATGAGACAATTCCTTTTTATGT 

AAATGAGGAAGGAACGTTTATAAAATTAGAGGGAAAATGGCAGAAAATAACAAATAATGA 

TTTAAGCAATCACACGTGGAATATATTAGCTTATATAAAAGACTTAATTGAAAAAAATGA 

CATAAAAATTGAGGAAGAAAACAATCATTATATTATAAGGTTAAAGGATGAAAATGCTGA 

AAAACAATTAAATCCTTTCTTCTACAGAGGGATAAAAATCCCAGGAATAAATCTAAAAAT 

CTCTGAAGAGGAAGTAGTTATTATATTAGATAAGTATGGAACTCCAATAAAAGTTATTAA 

AAAAGGAAAATTGTATGGAACTTCAAGTAAAGGAAACTTAGATGGAGTTATAGTTATAGA 

AACGGAGATTAAAGATATCAACAAAGATTTTGACTTCTCAATACCAGAAGATTTAAGTAT 

ATATAACTAACATAGGTTATTAACATCATTATTTAGTTTTATTAACTTATTTGTTTTTGG 

GTGGGTGGGATGATAACTACCACAACTCCTTATATTGAAGGAAAAAAGATAATCAAATAT 

TTGGGCTTTGTGCATGGGGTTGCATCAGTTTATGTTACTGTTAAGTATTATGAAGATGTT 

AAAGATGCGTATGAAAGGGCATTAAGGGAGTCGGAGGATACTGCCCTAATTAGAATGGTA 

GATAATGCAAAGAAATTAGGAGCTAATGCAATTATTGGGATTAACTCAAATTATGCAATG 

GTTGGAGAAAAAGGAGACATGATAATGGTTGGCATCTATGGAACTGCGGTTGTTGTTGAA 

GAAGATGGATAATAA7VATTAAAAAAATTAAAGAATTACTTTTTTAGCAATACATAAACAT 

CATCTCCAGTTTTTACATTTTTTAAATACTCTGCATTTTCAACTATTCCTCCAACAATAT 

TTGTTTTTTCA7VAGCTTTCTCCAGTAGGACCGAATTTATCACTCTCTCCCAATCTAACCC 

CAATCATTCCCTTATACCTACTAACCATGTTTGTTACTCCAATAGAGCAGGGTTCAACTT 

TATCCGTTGGGATGTTTTCAGGCAACAAGCCTTTAGCATACTCAGGATTCCCTTTAAACA 

TTACAATATCCTTATGTTTGAAATATACATGAAGCTTTCCAACTCTCTTTGTTGTTAATC 

CAGTAGTTTTTCTGAAATACCATGCTGTAATTGGAGCTTTATCTTCAAACAACTCTATAA 

CGATTATTTTGTTCTTATCCAATCCTCTAATTTTAACTTTCTTTTCCTTTAATACATCTA 

AGGTATATTCTGGCTCCTGnTCAACAACAATTGCATTTTCTAAATCTCCCTCTTTCTCnA 

CTTCTATATTGTATTTTTTAAACATCTCTTCAGCTTCCTCTATAGTTAGACCAATAGCAC 

ACAACCTCTCAGGAACTGnTTTTACAGACAAAACTCCAGAATCAGnAAAGTCAATAAGCT 

CTATTCCTTCCTTAACCCTTCCAACAACTGTGTGAGATAAAGATGAACTCCTACTTTCTC 

TGnAGATATAAACTTTACCTTCTCCAACTCCATAGTTTCTGACAGTTATAAATCCTCTCT 

CCCTATCAATTAAATTTCCTTCCTCAATTTTTAATGTCTGCAGTCTGCAATCAGCAACGT 

AAGTATTTGTGTTTTCAGTTATCTCAAATATTCCATCCTCCATTAAAGCTAAACAATGCT 

CTACTGCTGAAGGAGTTCCATCAAACTCAGCTGTGAAATAAGTAAAGATTCTCCAGCCAT 

CTTCTAATTTTAAATCTAAATCAGTTGTTACTAAGTAATCAACTGCCTCTTTCTCCTCTC 

TAATTGGCTCTATGTCAATAATTTTATCTCCAACCTCCAATCTATCAATAACCCACTTAC 

CTCCAACAACAATACCAATCTTTGGGTCTTTTAATCCATAAACTCCCTCAGTTTTTTTCT 

TTATGAATACAATATGCCCCTCATCTTTATCTAAACCAGATATACTCAAAACAACATCCC 

^'^TTTTTAAACTCTTTTGGTTCTGTTGAGATTTCTAAGTCAATTGTCGTTGAACCAAATG 

CTACATCCATTCCACTAACCCATCTGAGCTTCTTTACAAAGTCTTTATAATTATTTATAA 

AGAATTTTGCTGTCTCATTATTTTCAGTTATTGCTATTGTTATATTCCCCTTAGTTGTTT 

TAATTAAAAACTTCTTTGGTATTTTTTCAGCTTCTCTTTTAACTCCTTTTATTATTACGA 

TATTCGCTCCCTCATTATAGTATTCATCCTTTATAACCTCTCTTAAAGTTTCTCCAACCT 

TTTCCTTTCCATTTACAATTACCTTAGCCATAGTTTCACCAGATTATTTTGAGATTATTA 

TTGTCyTTATATTGCTCATCTCTTCCATTGCATATTTAACTCCTTCCCTACCTAAGCCAC 

TCTTCTTAACTCCTCCAAATGGCATATTATCCTGCCTAAACAATGATGAATCATTTATAA 

CTACTCCCCCAAACTCCAAGTTTTCAGCAAATTTTAAGGATTTATTTATATCGTTTGTGA 

ATATAGCTGAATGCAACCCATATTCAGTGCTGTTGGCTATATCAATCATCTCTTCCTCAT 

TAGTTCTAATTATAGGAATTACTGGGGCAAATGTTTCAGTTTTGCATAAAATATTGTCTC 

TATCAACTTCCAATATTGTTGGATAGAATAGAGCTTTATCTCTCTTTCCTCCTAATAATA 

ACTTACCTCCTTCATCTATAGCTTTTTCTACAACTTTTTCAACCCATTCTGCATGTTCAA 

CACTTATTAAAGGTCCTACATCAGTTTTCTCATCTAATGGGTTTCCTACGTTAAGTACTT 

TTGCCTTATTTACAAACATCTCTATGAACTTATCTGCTATACTCTCATCAACTAAAATCA 

TCCCTACAGAGATGCAAACCTGTCCAGCATATATAAAACTGCCTTTTATTAATGCATTAA 
CTGCTTTATTTAAATCAGCATCTTTTAAAACGATATTTGGATTAACCCCTCCCAATTCCA

AGGCAATTTTTTTAAAGCCAGCTTTTTTAGTAATTAATTCTCCAACCTTTGAACTGCCTG 

TGAAGGATATCATATTAACCTTCTCATTAACAACTATCTCATCTCCTACAACCTCTCCAG 

CTCCAGTTAGdAAATTATAAACTCCCAGTGGAACATTATATTTCTTCAAAGCATTTTCTA 

TGATTTTAGCCAACTCTATACAAACAAGAGGAGCTTTTGATGATGGATGATGAACTATAA 

CATTCCCAGTGGCTATAGCTGGGGCTATTTTATGAGCTGATAAATTTAGAGGGAAATTGA 

AGGGTGTTATAGCCCCAACTATTCCAACTGGTTCTCTCCTTGTAAAAATTAATCTATCAT 

CTGAAGGGATTACCTCATCTCTATGCTCTTTAACATAGAAAGCAGCTAATTTAAATGTTC 

CAATACTTCTTTCAACCTCTACTCTTGCCTGTTTTATTGGTTTTCCTGCATCTATAGCCA 

ATATTTTGGCAAGTTCTTCCTTCTTTTCTTTAATTTGTTTGGCAATATTCATTAAGATGT 

TGTATCTTTTAGTTATGGGGAGATTTTTCATAACTTCTTTATACTTTTCAGCCGTATCTA 

TAGCTTCTTTAGCTTCTTCCCTACTTAACGCAGGGATTTTTTTAATAACTTCTAATGAAT 

ATGGGTTAATAACATCCATATCTTCCCTATTTATCCACTTCCCATCTATGAACATGATTC 

CACCAAATAAAAAGAATGTTGTAACTAATTTATAATTTGTGCCTCTTTCATTTTAAATGT 

TTTTAACAAACATTTAATGTTTATATATTATGTGTGCTTATAATTATTAAGATTTTTAGG 

ATTTTTAATTTTGTTGTTTGGTTGATGGATTGTCTTGTTGAATATGTTTGAAATTTGAAA 

ATAAGAGCATTTAGAATTTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTT 

TGTTAATTTGATTATTTAGATTAAAATCAGACCGATTCGGAATGGAAACTTTTATAAATC 

CAATATTGTCTGTTATTAGAATTAAACCTCTCAAAGGGTTCAAAACAAGTGGAAATCTCT 

TTTTATTAAAATGTTGATATTGTAAATCTAAATATTTATATTAGTTTATTTTTTAATAGA 

GCTTTCACAATTTATATATTAAATAATACATATAGATGCTAAGGAAATTAACTTCTCCTT 

TAGTGAAACTATGAAAGTAAGAAAATTAAAAAATGCTGAAATTAATTTAATTGAGGAAGA 

GTTAAGTAAATATACTGATAAGGATTTTGTCAAAAGCTTTAAATATGAAAATCTAATAGT

TTTGGAAGGAAAATGGCTAACAGTTTGTTATACAAATATAGAAACAATAAAAAAATTAAA 

TATGTTTCAAGACATATTTTCAGTAGGTAATGTATTTGGTGAGATTAAGAGAAAATTTCG 

CTTATCCTTAGAGGGCTTTACATTAATATCTCCCAATATAATAAATAATTATGCAATTGT 

AAATGAAAAAGCTGAAGCATTATTTTTATATGGAAGGGATATCTTTAAAGAATCAATAAT 

AGAAGTTAAAGGTTTTGGAAGAATTGCTGTTTTTAATAAAAATAGAGAGTTTTTAGGTAT 

TGGACTCTTTGACGGAAAGATAATTAAGAATATA7VAAGATAAGGGATGGTATTTGAGAGA 

GGGTGGATAATAATTATCAAGAGTAACTACATAATACTAAAATTTATATTTATCCAAAAT 

TAAATTTTACTATTAATTATAATCTGAATTTTTAATAGGTGGAAACAATGAAAGCAAAAG 

AATTAGCTCAAAAAATTTTATTAGATATTTACAGAAACTTAGATGAATTTTCAAAGGATA 

TAATTAGAGGAGATTTAGCAGATATTGAATTTAAAGGATTCTATCTAAAAGGAAAAAACG 

GAGAGAAGGCATATATTAGAAACTTAGATGACTTTGAAAATTTAAAAGATTTTGATGTAG 

AGATGAGAAAATACAAATTAAAAAGTATAAATTTAAAGAACTTAGATGAGGGTTTAATGA 

TAATTAACTTATCTTCAAGGGTAAGTAAGGAATAT7U\GTTTGAAGCAAATGAATACTCAA 

TAATCTACCCATCAAATAATACAACCATAGAGTTTAAAGAGAGAGTATTAAAATGGATGG 

AGTTAGAAGATGATGAATTAGATGAAAAAATTATAGAGTTTGACACAAAGATGAACGAGA 

TTCTTGAAGAGCTGTTGGAAGATGTTGAAGTTGAAGAAGAAATTTCTGTCTATATTGATG 

TATTTATGGATGTGAATAAAATAGAAAATTTTGTAGAAAAAGATGACGAAAGAATAATAA 

TCTGGATTCATCCTGTCTTTTTATTCTCAAACGATGATGTCTTAAGAGGACTTTTAGCTT 

ATGAACTTTCAAGATTCAAAAGCAGATTCTTAGAAGTAGGTTATAAAGATATAATAAAAT 

ACTGCAGAGAATTAAAAAAACTAACCAACAAAAAACCAAAAGTTCTTGAAAAAATTAAAG 

ATATTGCCAATAAATATGGAGATATAGACTCTTTAAACTTAATAAATGAAATTGAGAATG 

AATAATTTAACATTCCAATTCTTTATTTTCCGCATCTTTATCCTTTAAAAATTCTTTTAT 

AGCAATTTTTTTAAGCCTTTCTTTTAACTCATCCAATCCAATATCTTTATCAGCAGAGAT 

TTTTAATATTTCTTCTATTCCAACCTCTTTTAATTTTTCTTCAATCTCTTTAACTCTCTC 

CTCATCTACCAAATCAATTTTATTTATAGCCACAACAATAGGAGCTTTT^CAAATCTTT 

TATCTCTTTTAATAGATTTATTTGCTCTTCTATTGTATAACCACAAAATTCACTGGCATC 

TATTATAAATAAAATCAAATTAGCTAAATAATTTAGAGCTAAAATTGCCTGTAACTCAAT 

ATCATTCCTCTCATACAGAGGCCTATCCAACAGTCCAGGAGTATCGACCATCTGAATCTC 

TCCTATATAACCAACATTTATTCCCTTAGTTGTGAAGGGATAGCTGTTTATTTCAACATC 

AGCTCCAGTGAGTTTTTTCAATAGTGTTGATTTACCAACGTTTGGATAACCAGCTATAAC 

TACTGTTGGCAAATCCTTAAATGTTGGTAAATCTTTTAATTTCTCTCTTGCCACTGCAAC 

AAATGCCATCTCTGGATGAATCTGCTCCAATATAGATTTAACTCTACCAACAAATTCCTT 

TCTTAACTTTCCTGCCTGTTGTGGAGTTCTTGCCTTTCTAATTTTTCTTGCATATTCATT 

TCCTAATTTTCTGACCAATTCAGAAGCCCATTTAAATGCTCCCATCGACTTTTTAAAATC 

ATCTATCCCTACCAAGACCTCAACCATCTCCTGATAAAACTTAGGAAGTTTTCTTACTGG 

AGGCGTTTTATCTATAACCTTTTGTAAGTTATCTGCAACAACTGAAGCAATAGTTCTTAC 

CTTATGCTCCTCCACAAACCTCGCTTTTAGTAACCAAGGTAATTCTTTCTGTCTCATCTC 

ATTTGCTACTTTTTCTCCTCTTCTTAAGGCTTTAGCCATCAATTCATCAGGCATCAATAT 

TGTTGGCATTTTTTTGAATGGATTAGCTTCTCTACTCATAATTATCACCAAAAAAGTTTT 

TAATAGATTTATCGATAAAATAAAATTAAAATAAATCTTCTAATAAAAGAATGATTTTTA 

TTTTCATTTATTTAAAATTTTCACCACTCCTGTATTTCCATCAACAACAATCCTATCTCC 
AGTTTTTATCTCCTCAATATCTATTTTATCAACTAAAGGAATTCCTCCTAAAATAGCCCC

AGTGGCAACTATTGGCTCACATTCTTTATTAACTATACCCTTTAAAATCCCTCTCTTTGC 

TAAACCATAAATAACATAGGAGCCAACAGTACTCCCCCTACCATAAGGAAATACAAAGAT 

TTTTCCTTTCAATGATTGTCCATATAAATCGCTATCTTTATCTATAATGTTGCCCTCTTC 

ATCAACTCCTCCTAAAAAAGAGAATGGTTTTTTAGAAACGATTGCTATGCCTTCAATAAT 

ACCTTTTGATATACTTCTTCCTTTTAGTTCCATAATAATCCCTCAATCATAAGAAAATTT 

ACACCTCCGAGCGTTAGnAAGGGGAGTGTTAAGAGGTATCCTCACTATAGAAGGGCTTTG 

CCCCTCTATTGGGATACTCCCCAGATAGAAAGTGGGGTTGCCTCTGGCAACCCCGCTCTG 

GAGTATAGCAATAGAGGCTTTGCCTCTATGCTTTGAAATACTTCTTCCTTTTAATTCCAT 

AATACCACTTTTAAATTATTCATTCAAAAATTACTTCATTTAAAAACTTCTCACAATAAT 

AGCATCTAATTTTTAATGGGTTTTTACTTTCTATTTTAAATTTTCCTCTAACTTTTTCTT 

TATTTGTTATGCAATTTGGATTTGTGCATTTTAATGTCCCTTCTATCTCATCTGGTU^TTT 

GTGGTTTAAGTTTTTCAACAACTTTTCCGTTTCTAATGATGTTGATAGTTACATCTGGAG 

AAATTAAAGATATTTTATCAACATCCTCCTTTTTTAATTCAATTCCTTCAATTTTTAi^AA 

TATCTTTCTTTCCTTTCTTTTTTGATGGGACATTAATGGCTATCATCACAGATGTCTCTT 

TTGGGACATTTAAAACCTTAAAAACCATTAATGCCTTTCCAGCATCTATATGGTCAATTA 

CAGTCCCATTTGTAATTTTTTTAACTTTTAACTCCTCCATAGGAATCACTTAAAATTTAA 

AGCTTTTGTTCAATTTTCTCTAATATTTTTAGCATTTTTATTTGTGTTTCTTTTATCTCA 

TTCAATACCTCTAATTTTTTATTTACATTTTCTAACTCTCCAAAAATTAGTTTTATCCTT 

GAATAATTTCTTTTTTCTTTCACTTGGAAAATTTGCCAAAATTCGAACAGTCCCATCTTC 

ATAGTTATACACAATTCCATCAATTCCTAAGGCATGTCCCAAGTTTTCAATCCTTTCTCT 

AAATCCAACATGCTGGATTCTACCGTA7VATAATAATTTCATAAGTTGTAGGCATAAACTT 

CACCAAAAATAATAATAAAATAATAAAAAATATATTATTAAAATTAAAAGGAATTATTCC 

TCTTTCTTAAACTCTTCTGGATTAAATTCTTCAGGGAACTGTCCTTTAACTACTTTACCA 

GCATGTTTTGGATGATATTTATCATAATCAGCTTCTAACTGCCTCAATACTGATTCTAAT 

GACTCTTTTAAAACATCAATAAACTCCCATCTTTCATGTTCTGAAACATAAGCTATATCT 

TCACCTTCAGCCTCTAATTTGTTTGGTAATGATGCTTTTGGCCTACCCTCCCCAACATAT 

AATTTATTTGGTGTCTTAACAT/y^GTTGTTATTCTATAATAAGGAACTCCTCCTTTATCT 

CTCTCTTTTTTGATTGTTATTTTTATCCAATGAATCTTTCCTGCATGTTTAACCATCTTT 

TTAACTTCTGTTGCAATAATTCTTTCAGCAAGGTCTTTAAATTCTGGGTCTAACATACCG 

TGTAATTCAATTTCAATCATTGCCCCTTTCTTTAAATCAGCAATATATTTAATAATATCC 

AATCTTGTGACAATTCCCCTCAATGATTTTCCTTTAACAACTGGGACTCCTCTAATATCA 

TATTCTTGCATAACTCTTGCAGCATCTGCAGCACTTGCATCAACATCGACTGTTATTAAT 

GGAGTGTTCATAATTAATCTAACTGGCTGTCCCATTCTTGGAACTTTTTCTCCTTTAAAT 

TCTCCAGCAGTCATCTTTTTCTTAGGTTTAAAGACCTTTTTTAATATATCGACTTCAGTA 

ACCATTCCAACTGGGTTTCCTTCATCATCTACAACAACCAATCTACCGATGTTATTGTCT 

CTCATCAT^GCTCTCGCTTTACCAATTGAGTCATTTTCGTTTATTGTAATAACATTCCTT 

GTCATTATCTTTGTAACCTTTGTATCTTTCATTATTTTTGATTTTGCAGCTCTTGCCATT 

ATATCATAGTCAGTTATAATTCCTACCATTTTTCCTACATTATTAACTATTGGAGCTGCT 

CTCTGCCCACTATCCAACATCTCACATACAGCATCCAAAAATGGAGTATCTTCATGTACG 

CAGTGTGCTTTATACATTAACGACCTAACTTCTTCATCTGTTGATGATGCCAACAACAAA 

TCTCTCATGCTTATTAAGTAATATTCTTCCTTACCGTCTTTTTTATCAACAACTATTAAA 

TGATGAAATCCGTTCTCTTCCATAATTCCTAATGCCTTTGAGACAGGTGTATCAGGTGTT 

ACTGTAACTACATCTTTTGTCATTATCTCTTTTACTGGTTCGTTTAACATTAATCTCACC 

TTTTAACAGAAATTTTGATGATTGATATTATATCACATTTAATATTTAAAGTTTAACAGC 

AGTTACGTATTAAGAAATGTGAATGTTTAATTTGGCTCTATTTTCCTAACAATATTAAAG 

AAAACTTTAAAATATTTTAAAATATCTTTTATCATTATTTGTTTAAAAATTAATATTTGA 

GTAATAATATTTTTTAAAAAATCGTAAGATTTATATATTATTTTGGCATGATATACTGCA 

TAAGTAATATAATATACATAATAACCTACACAAATTAATAAAAAATATATAAAATTATTA 

CTCCCGATGACGGTCTCCCATGGTGGGATACTTGAGGGAAGTAGTAGAGGTGGGAAGATG 

ATGGACTGGTTAAAGAATAAAAAAGCAATCTCTCCAATCTTAGCCTTATTAATCGTGTTA 

GGAGTTACAATAGTCGTAGGAGCAGTATTCTACGCATGGGGAAGTAATTTATTCGGAAAC 

AGCCAAGAAAAGACACAGGCAGCAGTTGAAGGGACTGCAACAAATATGTTTTATGATGCT 

GGGGCAATTAGGGTTGCAGCAACATGTATTGACAAAATAAGATACCAAGATGCTGATGAT 

AGTGATTCATGGTTAGGCTATCCAAATGGTAACGGAAAAATTGCAAAGCCATCTACTTCA 

AATGGATGTTATAATTCAACATACGGCACAGTATTCTATGACGAAAGATTTATTGTGGAA 

ATTCCAGTAACTATTGACACACAAGATTATAAATTAACTGGAGTTAAAGTTGTAGGAGGA 

ATCCCAAAAATAGTTGATATGGGTGGAACTTACACAAATGCCTTTGAAGATATATCTGCA 

AAGTTCTATGCATTCTGGTTACACCTAAATGACAACTATCAATTGTTGAAGAAAGATGGA 

ACATTGTTCGTTGGATACGTAAATAAATCAGGAATGTTTGAAGTTAGTAATGGGTATGTA 

ATTGCATGGAATCAGACAAGAGACACCTATGGAAAATTGGCTTCTTCAGTTGGTGCAACA 

TCAGACTCCAGCTGGGATGCAGTTAATACAACAACTGGAGTAGCTCCACTTGTAGAGACT 

J^^I^^^^^^^^^^'^^^^^^^^^^^^^^'^^^^^'^^GTTATACACAGCTACTGGAGAA 

GAATTAAAACCAGGATTTGGAAGTGGAACATTGGTTGCACAATGGTTCTGCAGTTCTGCA 
ACATACTTAGATAAACTATTCAACAACCCAGAATATGTTGTAGGAACATTACCAAAGAAC

TCAGAAAAAACTGTAAAAACCTACTTATTCTTCAATACATTATACTTGCCAAACTACAAA 

GGATCTACAAATGATGGATATGTGACATTTGAAGTTCCGTTAAAAGTTGTATCTAACGAA 

GGAGTAACTAAAGAAGTTAAAGTTAAATTTACGGTCTATGATGATGAGTAGATTTTCTAA 

TTTTTTCTTTTTTATTTATATACATCCATAAATTCAAATTTAATATAAAAATCTCTTTTC 

AAATTCTGAAAAAAATAAAATTTATGAAAAATAAATATTTTTTCATATAAAAATCATACT 

GTTAGTTAAATTTTCATATTTAAAATTTTCTTTATGAATTATATAGAAAATTTTATATAG 

ACGAAATTCGTATATACATAACCACCTACATTAAAGAAAGTTTAATAATATATAAAGTAA 

TATTTAATTCCAATGACGGTCTCCTGTAATAGGATACCCGAGGAAAGAACGAGGTGATGT 

AATATGTTTGAATGGATGAAGAACAAAAAAGCAATCTCTCCAATCTTAGCCTTATT/^TC 

GTGTTAGGAGTTACAATAGTCGTAGGAGCAGTATTCTACGCATGGGGAAGTGGATTATTT 

AATAACAGCCAGCAAAGTACTCAGTCAGCATTAGAAGGAACAACATCCACAATAACCTAC 

GCTGCAGGAGCCATAGGTGTTGGAGTTCCAAAAGAAATTGATGTTGAAGGAGATTTGGAT 

TTAACATATCCTACTCCAGATTACAAACTCTCTCACTTGACTACAACAGATTATGGCTCA 

TATGATGAAAGATTAATCGTTCCAGTTCCATTAACTTTAGAAAACTACTATGATTCGACA 

TTAACAAATGTAAAAATAGAAAGTGACGGAGCCACAGAAGTTGCTGGTTTAACACTCAAA 

AAGATTACATTAAACTACAACGGACAAAACTATGATGCATATTTATTATGCACAAATGAT 

GGGACTCCATTTAAAGGTATATTAAATAGAACTGGAATATACCCAGATGCTACATGGACT 

GGAGATGATGGAAACAACTATACAAGTGTATACTACATATTAGCTCCAAACTCAGTTACT 

GGAGTTGCAGCAGTAGATGGTAGTAAAGATTTATCAGTTACAACTGCTAAGAAATGGCCA 

TATTCACAAAATGATGTCCAAAGTATGAGGTTGTATGCAGGAGGATTCAACAATATGTGG 

TATGCATGTGCGGTTAATGGTTCATATTCAAGCTGGACAAATACATTAACAGCTACAAAA 

TTCATTGGATGGAACACTGCTCAAGCATTTTACAAATACAAAACACCAATCGATGCTAAG 

TTCTATACTTCAGAATGGGATGTTGGAACATTACATAAAGGAGAAAAAGTTTCAAAAGAA 

ATATTCTTCTTCTTTGGTTCAAGTATGGGTTTCCAAGAAGAGCCAAGTGGAGAAACAACT 

GTTAAAATCCCTGTAAAAGTTGTTTCCGACCAAGGAGTATATAAACAAGTTGATGTCAAT 

ATTGTATTAAAAGATAGGTTATAAATTCTCACCTTTTTTCTTTCTTTTTTTTATCTAAAT 

AGTTTTAAAGATTTAAATATAAAATAACCTTTAACTGCTAATATGTAATAAAATATATGC 

AATAAAATATTTCTTTTTGGATTAAAAAATAATTAGAATTTCAAAACAACCTTAAAATTA 

TATTACTTTTTCTAAAGGTGATAGAAGAGATTGTCAAGTTAAGTTTTTATTAAATATTGA 

TAAAAAATAATAAAATATGAGGCTCATGATAGAAGTTATAAAGGAGAGAATCGTAGAGAG 

GAAGCTTTTTAAAAGGAATAGGAATCGATAGAGGTTA7VAATCTTAGCAGGGCTTTTATAC 

TACCTCGGATTATCATTGAGGAAAGTGGTTTATCCCCCTCCAAATTCGAAGATATAAGTC 

ACGAATCGATTAGAATTTATTATCATAAGATTAAAGAGGTTTTAAATAGATTTCCAAGTA 

ATGGTAAATTCGATACGGTTGTTAGTTGAGTAAAAAGCTTCATAATGTTCTATAATTGGG 

TGAAATCGCTGATTTAACAACCTCTAAATCAAAAATAAAAGCTAAATAAACATTTTTATT 

AAAAAATAAAAAATTTTTATATGTGTTATTATAAAATCATGTATCCAAATATTATCTATT 

TTGGATTTTCAATTTTCTCTTTTAGTTCTTTAACTTTCTCGGCATATCTCTTGTTAAATA 

TCCTCGCTATTCTCTCTGTAGCGTCTAAAAGATTTAATAACTGCTCTAAGTAATAGTAAA 

TATCTCCAGAGTACGTTTGTATTTTAAACTCTTCATATAATGTCTTTGATATCTGTCCAG 

GAGTTTTTCCAGAAATCCTTAGATTAATAATCATCTCTAAAATTTTTTCTTCAACTTCAA 

CTCCCTCAAATTCCATAATAATTAAAGTTAAATCCTCCTTTAACTTCTTATCCTTAATTT 

TTTCCATCCCCTCCCTAATAACTTCCAAAGCATCAAAAAACCTTGAAGGAACATTTATAT 

TCAAAATTTTTGAAAGCTTTATTTTTAAATTGTTTGATAAATAGACGTTTTCAAAGGGCA 

TAATTTCAGTAATTAGTTTTATAATCTCTTTATTCTCAATAATCCCCTCTTTAATTTTTT 

CAGCAACCTTTGGATATAAAAATGAAATAGCAACTGCACTTCCATAATTTGTTAATTTCA 

CATCATTATTAGCTTTTATCATTCCATAACTCTCTAAATTACTCAAAATTTTGTTCAAAG 

AAAATGCCCTTCCAATATAAGGAACTCTATCTATATCGTATCTGTTAGTTATTCCAGCTG 

AGATTGTAGCTAATATCTGCTCTTCCTCCTCATCCTCATTATATTCAACTTTAACATCTT 

CAGGAACTGCGTTTAATAATTTAAATGCCACTTCATCCTCAGTATTTTCCATCTTTGCAT 

GATATTTCTTCCCTATTTCCACCAAAAGATAGACCTTCCCAATTTCATGCATCCCCTTTC 

TTCCAGCCCTCCCACACATTTGCTGGAATTCAGCAGGATTTAACCAATCAGCCCCCATAG 

CTAAACTCTCTAAGATAACAGTTGATGCTGGAAAATCAACCCCTGCAGATAAAGCGGCAG 

TTGTAACAACGCACTGAATTTTTTGATTTGCGAAATCATCTTCAACTTTCCTTCTTTTTA 

TATATTCCATACCTCCATGATAGAACTCTGCTTTAATTCCTTTAGATTTTAAAGCTTTAG 

CTAAATACTCTGCTCTCTTTCTTGAGTAAGTAAATATTAAGCACTGCCCTCTATATCCAA 

ATTTTGAAATGTTCTGCCATTCTCTTTTAACAATTTCTTTGATAATATTTAGTTTGGCAA 

AGTCATTTTTGCAGAAAATTATATGTCTCTCTAAAGGAACTGGCCTTCCATTATATAAAA 

CTAATTTGGCATTTAGTTGTTTAGCCAATTCCTTTGGATTTCCAATTGTTGCTGATAAAT 

ATATTTTTTGTGCCTCTTTAAATAAAAACCTCAGCCTACCAATTAAACCATCCAATCTTG 

CTCCTCTCTCCTCTAAATTCAAAGAGTGGATTTCATCAATAACCACTGTTCCAATATCTT 

TTAATCTTTTAGTTCTAATTAAATAATCAATTCCTTCGTAAGTCCCAACGATAATATCAG 

CATCTAACGATGTCTCAACATCAACTTTCTTTCCAATCCTACCTAATCCAACTCTTAAAC 

TAACTTTAAAACCTAATTTTTCATATCTTTCTTTAAATTCTAAGTATTTTTGATTTGCTA 






AGGCAACTTiAAGGAACTAAAAATAGAAACTTTTTTCCAGTTTTAATTAAATTTTTAATTC 

CTGCTAACTCACCAATTAAAGTTTTTCCAGATGAAGTTGCTGAAATAATTAATAAATCAT 

CTCCGTTTAACAAACCAGCTTTAACGGATAGTGTTTGAACAGGCAAAAGCTCTTCAATCC 

CCCTACTCTTTATTATCTCTTTAAGTTCTTCTGGAATATCTAACTCATCTATCTTATAAT 

TTTCAATCTTATCTTCTTCACTACCTGTTATAATATCATATCTTGTTAATTCCGGCTTAT 

CCAATGGATTTCTTATTCTCAACAAAGATAGAACTTTATCAACATCTTTAAATCTCTTTA 

AAAATTTCTCTATAAATTCTTCGCTAATTTTTACTTCCTCTTTGATTTCATTAATCCCGC 

AGTTTATACATATTTCTAAATTTCCATATCTACATCTATTATTTCTTGTCAATCTTTTrT 

AGATATTTTTTAATAAACAGAATGGGCAGAGTTCTATATAATCAAACTTTAAATTGTATG 

ATTTTAAAACTTCTTCTATTTCTTCCTCATTTTCTTTTAATATAAATATTTTGTCAGATT 

TTAACAACTCCAAAACCTTAGACGGCTGAATTAACTTATCTCCTACTCTACATCTGTATA 

ATTTGTATTTATCTCCAACTTTTTTGTAATTTGCAAATATCTTCTGATTATTTTTAACTT 

CTATCCCATCTTCTATTTTTCCACCTACTTTAACAATTTCTATCTCATCTTTTTTCTTTT 

TTGGCTTCCTAACAATAAGCATTATAAATCACCGTTCAAATAAACATTCAAATAAAGCTA 

TGAATAAAATAAGTTATAAATAATTATAACTTTATAATAACTTATTAGTAAATTTAGTAA 

AACTTTTTTGGGGATATTATGAAATTTATAATGAAATTTATAAAATCCAATAAAGGACAA 

ATTTCTTTAGAATTTTCTTTGTTAGTTATGGTTGTTGTTCTCTCAGCAATAATTGTTTCA 

TACTATTTGATAAAGACAGCTATCGAAACAAGAAATGCAAATATGGATGTTATAAATCAA 

AGTTCCAATGTTGCTGAAAAATCCTTAAGCAATGTAACGTAGTGTGAAACCATGCTGTTG 

ATAGGTATTACAGGAATGCCAGGAGCGGGAAAAAGCTCAGCTTATGAAATTGCTAAAAAA 

ttI?^ISI^^''''''''''^^^^''''''''^^^''''^^^'^t=tt^^^tatgaaacaaaaaaaagaggc 

TTAGAATTAACTCCAGAAAATGTTGGAAATACAGCTATAAAGCTAAGAGAGGAGTTTGGA 
AATGAGGCAATTGCAGTTGCATGTCTAAAATATATAGAAGAAAATTTAAAAGATAAAGAA 
ATAGTTATTGTTGAAGGTATAAGGAGCTTATATGAGGTTAATTATTTTAGAAAACATAAA 

TT^^™S^^^^'^^°'''^'^^^^^^^^^^^^^^^'^^^fGTAGAGAGGGACTTGAGGGAG 
TTAGGATTTAGTATTGGACATGCTATTGCATTGGCTGATTTTGTAGTAGTTAATGAAAAA 
AGCTTTGAAGATTGTTTAAATCAATTAGACAACATTTTACAGGAAATTTTAAATAACTTG 
GAAAAATATAAGAAATATAACTTTATTTATGAAACTTTAAGATAGATTCATTTATATAAT 
ATACTATTCTGTTTCCCTCTTCTTTAGATTTTACAATTCCAAGATTTTCCAGTTTTTTTA 
AATTATACCTTACAGTTTCAACATTCAAATTCAAATCTTTAGCAATCTTTCTTAAATGAr 
CAGGACTTTTTAATAAATATTCAAATATGCTTTTTTGCGTTTCATTTTTTAAATATAACA 
Tl^^^^''^'^''^''''''''^^^^^^^^'^<^^'f=°^TAGTAAATTAATCGATTACCAAATTTTT 
^^S^r^'^^^^'^^'^TGCTTTTTCTAATATTCTTAAATGCCACGTAAGTGTTGATACTG 
GTTTATTTAGGTTTTTAGAAAGTTCTCTTAAATGACATCCAGGATTGTCTAAAATATAAT 

^^^^^'''■'''''■''''^^^^^^'^^tt'^'^^^g^ctttttcttcatcaagaagatttaS? 

T^^^^^2^''^'^'"''''''^''^^^''=^'^^^T^'=T^C'^GATATTAGCTCCAAAAACTTCT^^^ 

ttagggtttcttgaaaagttagcattaaagatgcaagagttaagaaaacagtaaaaagta 
tatagggcagaggatttgatgttttcttaactatatgtatataaataagattaJJ?JSJ 
catcctcatctggcttaaaaccaacatatattttattatttatctttatcgccgS^ 
ttccgtctttatatcttgctataacctctcctttttctggtaattcatataatgcctcm 
cctctgattgttttaatttgagtgggtcttttatccagataattgtcttattgccatta^ 
^J^^^^'^^^^ccatttattgaagtaaaaataatagccatcttgtttgtatScatt?? 
tctttgtaatgataaggttataacttttattctctattttaaatcttagttttttaaS 
aaggagctatgtctctatcaacttttatatatggaggaataatcacgaattcttttttct 

ggtattgactaacattttctaagctaacattaataccataaatatgtacataaataccat 
aaacatgaacgatgcaaaataagattaaaattataagtaataaagtgtcctttt??aSI 

in^^?I^'''^''''''^''''™^=''^^^^°^T^C^TG^TTAAATAGAGCrcTAAC^^^ 
AAAATTAACGAAATAAACAATAATAACAATCCAAAGCTTATGTCTATATTAAAAACArra 

acaaatataaacactgaaatttcagaaattattatggatgttgaS??II^J^aacca 
atgagcgttcttctcctttgcatttttctaattttcttcaattt?attaatJ?JSS 
ttgttttcttctaccacatacttcacctctatttaaaaaagatatgcttcSgtcSI?^ 
tcaactgccacatcatacgcctttttatcacttaaaaatactctacctcc??aS?SS 
cccagtataatggcttttgctgaatttagaggaatatcataaatttca?g?S§?c?JS 

rAr^^^I^^o^''''''''''^^^^'^^^^^'^^^T'^'^<^CC^CTTAGTAGTTCAAGGTCTT^ 

GAGAAAATCCCTTTATCAACAATATATGAACGTCTCCTAAATATTGGAGGAATCAArATA 

GCAGTTATTGAAGTAGTTGTAACTGTGGCTATAGCAGCAGCTASm5TjiI???SS 

^S^r''''''''^^°''°'^^^^''5E'^''^CTTTCTCTTGTTTATTCATCTCATTTT?IJMGS 

TTTTTTGTCTCCTGAGTCTGlfeACTTCTCTTCTTTTTTATGTCTCTGCATTTGG???AS 

gttttagatgactcttcgttttccttttcttcttttttcttttttatttc?™Sata 
tagttataatatggaatgttaaaacttaaaccttcaacttttcSmJccI^JIJ™^ 
gaattatcaaccttaattgctggttttatttttatagtttcataaaS?§??g??I™ 
TAAAATTTAAGATAAAAAATAGGAACATAATCTGAAGTGATATTAAATGACTGTGATACA
TTCTCATTAGGTTCTATAATATAGGAATAATTACCAAAATCTATTTTTGTATTATTAACA 
ACAGCCCAATAGCTTGCATTTACATTTAAAGAGACATTCTTATCATTTTTTATTATATAA 
ATAGCTGTCCATTCAATAGAACCATTACTTAAAATTTTATTTGACTTTATTATGGAAAAT 
GGTATTGGCAGAGAATATCTTATCATCAATTTAGCGGGTACTTCAAAACTTATATTGCTA 
TCTCCGAAATATAAAATATTTGAATTTTGTATTTTTGATAAAGTTATTATTTTATAGGAC 
TTGGCAGGAACTAAGACTGATGCATTTTTAATGTTAAATCCTCCTGGAAAAGATATGTTA 
AATAAGATACTGTAGGGATAGTTATTGATTATTTTATAATATACCCTCCATTCATCTCTG 
CTGATTCTACTTGCAGATATAGTTATGGATATGTTAAATTTTTTTGGGGGGAGATTAAGA 
GTTATATTTTTTGAATATGGAAGATAAAAGGAGGAGTTTCTATAAGATATATAGATGCTA 
AAAGCATCGGGATTTTCATAATCTATATGTAACTTAGATACATCAACTGTTTGAGAACCA 
TTTGGATAAAATACAACCCCCCTATACTCATATATTTCCCCAAAACTTAAAGAAATAAAT 
GCAAAAAATAGTATAATAATAATAATTAAATATACCGATCTCATGGTCCCCAATTATTAG 
GTTTTATTTTTTAAAGTATTTAAGATTTATAGAACAATTTTTTGACGAATTATTTTTTTG 
GGTCAATACCGATAATAAGTCCCTCAACTGTTGTTATATTGGTTGTTATTGATATGTTCA 
TTGTCTCATTTGCATTTATCACATTAAATTTAAACCAGTAGGTGTTTCCATAGGTTCCAT 
TTTCATCATAATCCCCTGAAATATTGATTACTTCAATATTATCTGGCTTATACCAATAAA 
CATAAACATCTCTTGTTGTATTATATGATTTTATTGTTATGTTGTAGCCGTTTGGAAGCA 
TTACTATAGTTCCAGTTATGTTGAAGAACTTTTTAATTAAGAGATAGTTTCTTCCAAATT 
CTGATGTTCCAATACCTTTTATCGTTGCTGTTGCAGTTAGTAAAGATTCATCTCCAACCC 
CTTTTCCAGA7VACATTTATTGTTCCACTGAAAAATCCATTACTATCAGCAATCAAACTTC 
CCAAATAAATCCAACTTTCTCCATAGGTATCATTCAATACAGTGCCATCTGAAGAAATAT 
TATTTCCTATTAAATTATCTCCGCCACTTAAATTTTTAACTAAATAAATCTCAACTACCG 
CATTTGCAAAGTTTGAACTGCCAGTTCCATTTCCAATGTATCCTTTAACTGTTAAGTTAT 
CTCCGTTTAACTCAGCATAGGTAATTATTGGATAGTCAATACCATGATTTGCTTCATTAT 
AGTTCAATAGCCCATCATTTAGAGTTACATTATCATCATCTAAGTCAATTCCGAGTAGAG 
AGTTGTTGTAAATTGAGTTTTTGGAGATTATTATATTGTAAGGAACAAAATCCCAGTTTG 
GAATTGTAATTCCCTTTTCATTGTTGAAGATTGTGTTATTAATTATGTTGATGTCTTTGC 
TTGCCCCTATTAAAATTCCATAGGCAGAGTTGTTTGCTATGATATTCTTTGAGATATT/VA 
ATTCTAAGCTAATCCAACTCTCATTTAACCCATAAACCTCAATTCCTCCAACTTTTCCAC 
CATAAGTTGGATTTGGACAGAGATTGTTGTTTATTATTTTATTACCCTCTATTATTATAT 
ATCCATTATCTTGATTATAAACTCCATAAGCCCCAACAGTTATTCCTGCGGTTATATTTC 
CTATTGTGACAGTTAATCCATTGTATTG7VATTGTGTTGTTTATAATGGATATATTTGTTC 
CAATCCAATCCCAAGAATTCCATCCATTTGCCTCTTGAATTAAAATTGCCTGAGCATCGC 
TGTATTGAATCGTGTTATTAAATATAGAGACATTCTCAACTCTTCCTCCTATGTATATTC 
CATTACCGCTATTTTCTTCAATACCATTATTTGAAATTATGTTGTTTTCAACTTTAACAT 
CACATAATGTTCTGCTCCACAACCCTTCTAAAGAAATACCATTACCCAAGTTGTGAGATA 
TATTGTTATTTAAAATTAAAACTCCCTTGGTATAGTTTCCACTGATTTTTATTCCACTTC 
CTGCTGGGTCTCCTCCAATCAAACCGTTATTGGTTATGTTATTATTAACTATGTTTATAG 
CATTTACTCCATAGATGT7VAATTCCATCTTTGTAGGAGCCGTTTATTAGTGATTTGGATA 
TATTTACATTTGAAGCACTTCCCAACGAATAGATAAACAACCCATAGCTTCCGCTGTTTA 
ATACACTTGAGTTATAAAGTTTTAAATTATTCAAAAATCCATCAGAAGGGACTTCAATAT 
CAATACTGTTGTTTATTGAATCTCTTAATAATGAATTCAACACACTGAGGTTTGATAAAT 
TACCATAAGAGAGTATTGAGTAATTGTTATTATACAATAGAGAGTTTTCAACACTTACAT 
TTTCTCCATT/UVATATTGCAATTCCATCTGTGCTTGCATTAAATATATTTGATGAATTTA 
TATATACGTTAGATGAATTAACTATTTCAATTCCATAGGCATTATATGAAATATTCGATT 
TTTGAATTTTTGAGATGTAGTTTTCTTTTAAATATATTCCTATGCTATTGTTCATTATAT 
TTGAATTCAATATTGAGGAGGATGAATTCTCTAATAATAAACCTTCATATCTATTTTTAT 
ATATAAGGGAATTATTCACCAATATGCTTGATATATTTGCATAGATTCCAATAGAGTTAT 
TAATTATTGAAGAGTTAAGTATTTCTAAAGTTGAGTTCTTTGAATACACTCCCTCATAAA 
CGGAATTCTTAATCTGAGAGTTTATTAGTTTTATTCCATTTCCATCTTTATATAAAACCA 
ATCCTTGATTACATGAGCTTATAGTTATATTATAGATTGTtATGTTTCCAAAACCACCAG 
CCCAGTTTGCCCAATAAATCCCTACACCGTTTTTTAAAGATATTTGAGAGTTATACAACA 
TAGCCCATATTTTGTTCAACATTGAAATTCCATATCCTCCAGAGGCGTTTATTGTT7VAAT 
TGTCAATATACAATGGATAATTCTGCAAACTCCAATCATAATCAACTAACTTAACCCCTG 
CATTTAATATTTTTCCATAAATTCCACAACTCTTTAATGTTAAATTATTTAGCCCAGTTT 
CATTTCCTAATATTCCAACACTGACATTTTCAAAGAATATATTTCCAATGTATGTATCAG 
CCCCCACATTGTCAATAATTGGATTTGGAGATATTTCTTCGAGATTTAAAACATTATCCC 
CATTTTTCACATATGTTGGCTTAACGTATTGTAGGTTTTTTGCTCCTCCTGTTCCTCCTG 
TGAATCCGAAATAAGTTGAGTTTCCTATAATTTGGGTTATGTCTTTATTCCATGTTAATG 
CTAAATTGCCATCGAAATATACTTGGAGTGTTTTTGTTGTTGCATTCCATACGATTTTTA 
TTAAGTGTTCTCTTCCATCCTCAACATTACCTAAATCGTATGGGTTTGGTGTTGAGTAAG 
TTAAGGAGTTGTAAGTGTGATTTAAGTTCCCATCAACATCTATTGCAATATGGTCGGTTG 
TTGCTGGGCTGTCAAAATCGTTAAGCCAAGTATCAACCTCCACCGCTACACTCGGAGAAA 
TTCCACCATAACCCAAATCTCCTCCAGTTCCACCTAATTCGTTAGTCCCCAACGATTGCA
AGGTAAAGGTTATACCATCTGCTCCATCAGGATTGTCTCCCAAATACGCATAAAACTCAA 
CAACCAAATCCTCAGATAGATTAACCGGCTTGTAATACCAAACACTACCTTTTTGGTTGT 
AGTCATCAGGCGTTAGTATCAAAGTTAGATTATTTGAATTATTTATGTATGCGTTACCGT 
TAGCTATCCATTGTAATCCTGATAAAATGATGGTTCCATTAATCTGAGTTGAACCATTGA 
CAATAGTCAAATTGTCTAATATTTTATTTCCACTTGTATAAATGTAGTGATTTCCATrc? 
TTGCGTCGATATTTGGTATTCTGAAATAACTCTCATCTTTCCCATAAATTGCATTTGCA? 
TTTTTATAAATTGGGAAAAACTACCCTGCCCCGTAGATTTCGTATTTGTTAT?IJI?C^ 
AGCTGAATCCAAATGTTATATTTTTCCCACTATATGAATTTAGATTTATCAAACAGTAGT 
GTTCATATTTTCCATCTTCCCAATTATCTTCTTCATTAGGGTCTCTTCCTCCAAAcI?S 
CAACTACTCCATAAGTTGGATTTATTATATACCCATCCCCATTTTTTACGTATATTGGCT 

taacatactggagattttttgctcctcccgttcctcctgtgaatccgaJgJJJgSgaS 
ttcctataatttgggttatatccttattccatgttaatgataaattgccatcgaaI5aS 
cttggagtgtttttgttgttgcattccatacgatttttattaagtgttctcttccatcS 

CAACATTACCTAAATCGTAAGGGTTTGGTGTTGGGTAAGTTAAGGAGTTGTAAGTGTrCT 

ttatgttgccgttaacatctattgcaatatggtcggttgttgctggagca??S?cg? 
tgagccaagtatcaacttcaaccgctacactcggagaaattccaccataacSJJatc?? 
ctccagttccacctaattcgttagttcccaacgattgcaaggtaaaggttata??J?ctg 
ctccatcaggattgtctcccaaatacgcataaaactcaacaaccaaatcctcagaSa? 
^^^S^^^^^^^<^t^^taccaaacactacctgcttcaccataatcatctgttg™S 

GCTTATCTGGGAATATTGAAGCATTTCCGTTGGCAATCCATTGAGAAGAGTTTATTCrAr 

tatatactgtttgatatgtttcttcagcccagatgtcatttttSc?J?aSgaSggS?a 
aacctcttgtagttcctacagttcttgaattcactaccacaaagtaagtttttgaggIg? 
tataaactaagaaggaataatgaccaaaaatatctgttgttgttgaatttactatggtat 

^I^^^^'''''^''=''^^^^^^^^^'^'^^TTGCTGTCTTCAAGTAGAGATAcSSA?rccA? 
AAATTCCTTTATCTTCACTATCTTCTTTTCCAAGGGTTCCAAAGTCCTCTTTTACATAGC 

cctgaatttcatatccgcagtatatggtatagtttttcttcgaaattacaccattaga55 
cgattccagttattgtaataagatattttcctgactctgggaggctaaatgagtaJ??m 
agagcttccaaagtgatggggagtttttatctatttcctgtagtagcattgaS??2a 
tatatacactaccgtttggataatacacagttatatttgccccgcttatatcg?a^ga^? 
caatagggtctgtaatatttgcaaatattgtaacattttcatttggaagata^acJ???? 

r^^^^^^''^^'^''°''^^™^^"^^^T^GTC<^TAGTGTTTAGcS??A?J??GJI?J 

gatatgttgaattatgatagatgtttattgagttagaagatatttgattttccactctca 
aaaccaaatagtagtttttgggtattgtaattattgaatctaaggttatattgaaIa?g? 
aagattttatagtatcatctaagtatagatactcaacatcacttcctSgta^J^? 
^^^^''''^'■''''^^^^''^"^^^^'^c^c^ttttatgcatttcagttccaaata??! 
cgtttgggtcatttatataaagcaaaattggaatttttcctacaactgtaaagttgt??J 
cgaatcttggatattgtatccatgaagctaatgaattgctattaattgttgtSIg?5g5 

TTTTTTGAACTGAAGGAGGATATGGTGTTGATGTCTGGAATGTTGTTTTTCCAAATATTr 
ATGGATTATTTATATTAACAAATTTTAATGTAGTTGTATCAATCTCTCCTAATGGJSr 

ttgaaggaattgtctttgagacagttaaatttatctctccagttgSS?^gmaS^ 

GCAAATTATTTGAGTTTAAATCATAATTTGGGTTTATATAATCCCASTTSArcJrrlT 
J^?^^;^^7^^G^TATTAAAATGGAGTTGTTGTATATT???S?Sc?Scc?ScS^^ 



^^I^S^3^™I™I^TE^r^TTCC'^CCAACTCCAAAGTTATATAATGTTATATTGT 

'G 
,T 



AAAATATCTCTTCTCCAACATTTCCAGTCTTTTCCTGATAATTTGGTTGGACTGAAAATG 

atgttatatagattataatgcttttttcatcgtttgatgtattctggtcatSgg^It 



I™Sri^?^!!!?!S!5!5TT^TATTAATTGACACATTCAAGTTGTATGCATCTACCA 

AAAGC 



ATCCATATAAGGCAATTGTTGAATTTATGTATATTATTGGGCCAATATTTGGrTTAAarr 



CTACCCAAATTTCGCCAGGTTTTAATGAATCTTTAGTCCATGCTAATGSSJSCTrrAT 

caccctcatatgatgaatcattatttaaattatcatatctaItaSgJJcacgSc??? 
^^I^^^^^^^C'^tcatgttcgtaacttgggatatttgatttScctSg?™?^ 

TATCCCCAACAGGTGCGTTAGAATCGTAGCCATACACAACATCArSMASA^Tl^ArT 

^I^$^:^^^^^tccccaccagcttcctctaaaattccaatccatccc??ggSIISS 

:^^"!^™^*''°'=^'f^TTGTGGTTGGATTTTTTATGTAGTATATGSAGcSJ?^5J?lT 

I^I^I$^'''''''^''^''^^^^^"GA<^'^^TATTTAGCTCATTATTA?5?JJ^GG^ 

ACATATCTGTTATAATTACACTCTCTAAAATCCCATTTGGAACGGTATTTA^gSat^^ 

CTGTCCTATTTATTTCAGATAGGTAATAATATGTCCACCATCCcS^SI???G^^ 

CTAAAACCCCATACTGTCTTGTTAATGCTCCACTTGTSATTS^SS??SGlIr 

cagcgtaagcatcaatattatttgcataattgtaatctccagtc??ccSS?gg5?^? 
AAGTAGCCACGGTAGTTTCATCATCCTCCTTATTATATGGAAAGACAATTGCTGAAATCT
GCCCACAACCACCATAATCTTGCCCATTATAAGATAATTCAACAGCTATTGAATGATTTG 
GAGCTGGAGGAATTAAAGATATATCATATCCATAAATAGGATCGTCGTAGAAGTATAGTG 
TGTTATTTTCAATATACCATCTTACTTCTCCAATAACCTCTCCTGTTAATCTATTTATTG 
AAATGTTATTTCTTATTACTCTTCCATCATCTCCAACAATTTCTTTAACCCTATAACCTT 
TTGGAATCTTACAGCTGAATCTATACCAAACCCCATCTACTTTATTATTAACTTCAACTC 
TAATTTTATTACTGCTTATTTTTTTCACTATCTCAGTTGTATTTAGAATATTTTCAGGTA 
AGTTGTAATCTTTTAGAATTTCTTTATCTGCTTTAAGTTTGATTTCTACAATACTATAAT 
TGTAAATCTTTACTATTTTTGAAATCTTGTATTTTTTTCTTTTTTCGTATCTTTTCAATA 
ACTTCTTAATCTCTTTTTCACTACCTTCAAGTTTAATAATTATCTCCTTATTGACTGGGT 
CATAGGAAATTAGCTTTTTAATTTCTTCTTTATTTTTGAATTTAACTTTTATCTTTTCTT 
TTGCATTTCCACAAATT7VAAATTATTTTTTTAATTTTTTCTGGTTTTGAGATATTTAACT 
TGGGTATTAAATAATCTAAAGGAATTTCAAAGGCCCCATTTATTAGTGAAATATT7VATTT 
CTTTTTTATCTACAATGCAAGTTATAAACTTCGGTTCTTTAACATAGTAGGATACGTTTC 
CAACGATAAATCTATTATCTATCAATTTTGCATTAATTTTGTAATAATCTACAGCAAAGG 
TTTTTTTAATCCCATCAACAACTACTGAGTAATTTCCTAATATAACATTCTTTTTTAGTT 
TTGTGGATAAGATATAAAACTTCCCTTTTTTATGAACTTTTAGATGTATTATTTTATTGT 
TAGGGGTTACAATATATGCAGTATGTGGCTTAAAATTTGTTTTTATAACTATTTTTTCTG 
ATGGGAAATAAACGTTTTTCAATATAACGGTTTTATTCGTTGTCGTATTTTCCTCGATAG 
AGTAATTTAAAAATATTGTTCTATTCAATGTCTCATTTTTGAATTTTGCATAAATAATGA 
TTGTTTTATTTAGAATTAGAGGATATACCAAGTATTCATGTTTTTTTATTTTTTTAGTCT 
TAACTTTTATATTGAGGTCTAAATATTTAGCATAGGGTTTCCCATTTGTTTTAACAATCA 
ACTTAAAATTGTTGAATGAGACATTTAGATAATGTAATTCTTCTTTTTTTATGTTTGTTT 
CTTTTTTTTCTTTTTCTGGTTTTGTGTAATTTAAAATAACTGTTCTATGTAATATTTTAT 
CATTGAATTTTGAGTAAATTTCTATTGGAACATTTAAAACTATTGGAGAGATTATATATT 
CGTTATCACCTATTTTGATAAACTTTAAGGGCGTTTTATTTGC7VAATCCAAAAACTTCTC 
CATTAGTTTTTACTATAACTCTATAACCATTAATAGTAATGTTTAAATAACACTCATTTT 
TTTCTTTGTTTGTATCTGTTGTATTATTGATTATTATGCTACCACTATTATTAGTGGTGT 
TTGTTAAGGAATTGTTTAACTTATATATCAATGAATTATTTAATTCAAAATTACTGTTGT 
TGTTTGTTGTTGATACATTTAAACCTAAACTCACGTTTGAAAGCAATAAAATAGTGAATA 
CAAATATTGAAATATTCTTT7UVATAGGATTTCATATTATCCCATTGAAATTTTTAACACT 
GTTCAAAAGTAAAGTAATGTATAGGTAGTATATAAAATTTATAGAACAAATATTAGACAA 
TTATTGAGATTATTAGTTTTTATTCTTCTTATCTTCCTTATTTTCTTTATTTTTTTCTTC 
TCAACTCAGTATAGACAAATATTATTATTAATATACACAATATGCTTAACCAAAACACCA 
CATATCCATGAACTGGAAGTTTCGGTAATGGCAATATTGAAAACTCAGCATTTATAATAA 
GCAAAGTTTTTTTAGGGATTTGGAAATTATTTCCATCTGGTTCTTGGTAGGTTAGATAAT 
AAGGAGAGACAAAGTTTATATGGAAGTTTTTTATATTCAAATCCCCTCCAATAGGAATTG 
AGACAGTTTTAAACAATATTCCATTATCAAC7VATATCAGCATCATATTCAATTATTAGAG 
TTATATTTTTTCCACCATCTATTGGCTCCCATACTTGATATGTTATAACTGAATAGGTAT 
CTTTATAAGAGACATTTACATCCATTGGATATTTTTTATCCCCAATTATATAATAACCCC 
TTAAATTCTCAATCTTTACAGGtTTTTTTCTTTTGTAAATGGTATTGGAATAATCAGTAT 
TTTCTTTGACTCTTCTTTTTGTAATCTTAACTCCCCAATTCCTGGAACAAGAGGATACTT 
AACAAGATTTTGAATAGTTATAACATTTGTTATATGTGCAGGATTTTTTGTTAAATCAAC 
GGTCATATTATAGTTTGTTATTTCCTCAACATCTGCAAAAGTTGGTGTTATAAATGCAAG 
GAAAAT7UVAGAGTAATGCAAATCTCTTCATCATCATCCCCCAACTTTATTTATTATTCCT 
TATATTCCTCTTTGTTAATATACCTAATCCAAAGATAACGTTTATCAGTATTAAAAATAT 
CTCAAAGTTGTTTTCTACTGTTCCGGCTACTGTTGTTATTTTTGGTGAGGTTGTTGGTAG 
TAGTGAGTTTGTAGGGTCGATACCAACGATAAATGCGTCGCTTGGGTAGAATTCGCCAGT 
TCCGTTTAGTTTGTAGTGTATGACTACTGTTTTATTTGCAAGTATTTCAGCAGTGTCGTT 
CCAGTTACCGTCCCCATCTGCTCCTGGATATATTGCATGTAACGCCCACCACATGCTTAA 
ATTATACCTTGGATTTGTTGTAATGGTGTGATTTCCTTCAGCAATCAACATACTTGATTG 
ATTAACCCACTCATCTGAGACGGTGAAGTTCTTAGGAATCAAATCATAAACATACACATA 
CTCAGGAGTCTTCACACTACCAATATTCTCCACAACTATATAAATATCATAAGTCCCATC 
CGCATCCGGAACAATATGCTTAGTCACCTTAATCA7VATAACTACCCACAACATAAATCTC 
CTCAACAACAACATAGGAACTTCCTATTTGACTTACTTCATTTAAAAGGATATAATCTTT 
TTTTGACAGTGTAAATGAGCAGTTTGCCCAAACGATTGGAACTCCACTAAATGTGAAGTT 
GTAGGTTTTAGAGTTCCATACTTCTCCTGGAGGTATGTCAATATTTGGAGTTATAGTGTA 
ATTACTCCCATCTATCCAGATAGATTTGTTAAATGGATTCCAATACAACTCATAAGCAGA 
TTTATTTACTGCCCATATATTAAGATTCGTTAAATTAAAGGAATATGATTTTGCATCATT 
TTTAAATGTTACATTTTCTATCCAAATGTTTATTTCTCCCGTAGTTCCATGATCTGTTGA 
AATACTATAACTTCCAGATGCATAAACACCTTTTATAGATGTGTTTGTATTGGTCCCATT 
GTAJVTTAAATAGTATTTTAGCAAATCCATACTTTGCTAAAATATTATCCCTTTCACTATA 
TGAATAATTTCCCAATACATTAAATACCAAAGTTGCACTATCATTACTCCAATTTAAGGT 
AATATTTGTCCAATTTATTGCATTTAGATTTGAGGTATTGTAATCTTTTTTGAAATAGTC 
CCCATCAAATAAAACTGCTGTTCCTTCACTTGAATTTGCAGATGTTATTTCAAGATACTT

CCAGTTTTTATCTCCATAAAAGTCGAAGTTTTTTGAATTCACTTCCAAATCATCAACATA 

ATAAACATAACCTCCATGAATAACAACCCTATCAAACTTTGTATAGGTATTGTCTATTGT 

TGAGACAGTTGCGGCTAAGGAACCATTTTGATAATAAGTTGAGAATGTTATTGTTCCATT 

TGAATAAATTTTTAGTTCGAAGTAATACCATTCATCCTCTGGAGGATTCCAATAAACTTC 

AGGACTAATTTCTGTAGGATTTCCATTAGTTCTTCTATCAATTGATATGTAATTACTGTA 

GTGATTTACCTCAAATGAATATCCATCAAAATTCTCATCCTCCAAACCAATCCTGTCTAT 

AGGGCCTCCTCCCCAGTTGCTCGGTCTATATACCCATCCACTTATAACTACATCCCTTCC 

AATTTCTTTTGGAAGTAATTTGTACCCTCCATTTGGATCGTTGTTTAAACTTGTTGAAAT 

CCCATATTTTTCTAAAGAGTAATTTCCAGAATGAGATTGAATAGATGACCATTGAACTAT 

CCCGTTCTTATACTGATTCCAACCAGTCCAATTTTCAAAGTTATCATAAAACTGCCCATT 

ACTTAGATATTTTATAACATTAACATTGACATTTTCTCCATTAGGAACGAGATTTTTGTT 

TAAATATACAATTAAATTTACAGTCCATTCTGACATTTTATTTGCTGGTATTTTTGTAAC 

ATCGTAAGTTTCATTTACTAATATTGGTAGGGTTTGAGATGCATCGATATCAAACTCATA 

TTCCACATAGCTGTTATTTGGTAGTATTGGGATGTGTATGTATGTGTTTGCATTTGGTAA 

GTTTGTATAAGCAGGAGCAGAACTCTCAATAAAAACCCCTTTTGGAGTTCCATTATAAAC 

TAATCTCAAACCACTTGCATTATTTTTTATATCCACTGCTACCCACACATCGTTTAAAGT 

ATCTTCTTTATAAGGGGCAGTATTTTCAATAATGATATGTCCCGTTAAACCATAGGAATA 

GTTTGTTTTTCCTGTTCCATCCACAGTAGCAGTTGCGTTGTACTCTTCAATATATCTTAC 

CCTTAGTGGAGGATATAAATCTGAAATCCTCAAATCATCAACATAGTAATCTTGTCCTCC 

ATGCACAACAACTCTATCGAACTTAGTATAGATGTTATCTATTGCTGAGACAGTAGCTCC 

TAATGAACCATTCTCATAATAAACTTCTAATCTTAAAGTTCCATTTGAGTAGATATAAAA 

CTTAAAATAATACCACTGATTTTCTGGAGGATTCCAGTTAGTTATTACTAAAGTATTTAC 

AGCAATACCATTTTCTCGAGTTTCTATTGCAATTTTGTTATAGTCGTGCTCTATTCTTAT 

AGAGTATCCGTTGAAATTCTCGTCCTCAATACCTATTCTATCCCATCTTCCGCTAACATA 

TGGCAAAGGTCTATAAATCCAACCTTCCATTACAATATCCCTTCCAATTTCCTTCCCAAT 

TAGTTTGTATCCACCATTTGGGTCATCATTTAGAAACTTTCTAAGGGAATATATTCCTGA 

ATGTGCATAATTAGAAGATTGCTCTACAGCCCCACTACTGTAGTTATACCACCCACTCCA 

ATTTTCAAAATCATCGTAGAATATTGTTTTTAATCCAGAAACACTGTTTAACATTGATGC 

CATAAATACGAACAGTAAAATACATAAATATATGAATTTTAATTTTATATTTGGCATGCC 

CCACATCACCCATATAATATCGATAAAATTAACTTAATGTCAAAAATCATATTTGAATTT 

AGAAAAAGAATTATAAAAT^TAAAGAAAATTAGTTTTACATTACCCTTCTTATTATGATT 

CCCAACCCTAC/^GTAGGGTCAATAATGCAAGGAATGGTTCTGAGTTGTTTTCTACTGTT 

CCGGCTACTGTTGTTATTTTTGGTGAGGTTGTTGGTAGTAGTGAGTTTGTAGGGTCGATA 

CCAACGATAAATGCGTCGCTTGGGTAGAATTCGCCAGTTCCGTTTAGTTTGTAGTGTATG 

ACTACTGTTTTATTTGCAAGTATTTCAGCAGTGTCGTTCCAGTTACCGTCCCCATCTGCT 

CCTGGATATATTGCATGTAACGCCCACCACATGCTTAAATTATACCTTGGATTTGTTGTA 

ATGGTGTGATTTCCTTCAGCAATCAACATACTTGATTGATTAACCCACTCATCTGAGACG 

GTGAAGTTCTTAGGAATCAAATCATAAACATACACATACTCAGGAGTCTTCACACTACCA 

ATATTCTCCACAACTATATAAATATCATAAGTCCCATCCGCATCCGGAACAATATGCTTA 

GTCACCTTAATCAAATAACTACCCACAACATAAATCTCCTCAACAACAACATAGGAACTT 

CCGTACTTTGTTGAGTATTCATTAATACTTCTATTTATTAATGTTATGTTCTCGTCTGCT 

ACCTTAAATGTACAGTTTGCCCAGACAACTGGAATTCCATCAAATGTGAAGGCATATTTA 

GTTGAACTCCAAACGCTTCCTGGAGATAATATTTCATTAGGTGAGGACGTCTGTTTTGAA 

TTAGGGATTAATAATGTTATATTAAATGGGTCTAATATTACTGGATTACTACCATTTACA 

GCCCATATTGTCACATGAGTTAAGTTAAAGTAGTAACTTGATGCCTTGTTTGATACATTA 

GCACTTTCGTACCATATTTCATACTTACCGCTTGATGCATTTAAGAATGGTCCTTCCTTG 

GTTGCACTTACTCCACCATATCCTGTTGCATAAATCCCTTCAATTTTTGTTCCTGATTTA 

GTTCCATTGAATTCAAAGAAGATAACAGCAAAACCATATTTCATTAAGGTTCCTGTTCTA 

TTTGTGTAAGTGTTGTTTCCTGTTATGTTGATTGTAATGGTTGCATTTTTAGTAGTATTT 

ATTACGACTCCTGTCCAGGTTAATGAATCGTTGTATCCTGGTAAGAAGTAAGGACCATCC 

CAGAGTGTTATTGAACCTTCGTTGGCTATAGCACCTGTTATATTTAAGAAATTCCAAGTG 

TCACTTCCATAATTATTTGGGTCGTTACTTAGATATTTTGTCATAATAACAGAAACTGGT 

GTATCTGTTGCCGGTAGTGCTGAAACATTTCTGCTTATGTTTAAATAGACACTCCAATTT 

GATAATCTTTCTGATGGAATTTTAGTATCACTATATGTTTCGTTGATTATCAATGGAACT 

CCAGTTATTGATTTATCTATAGCAAATTTAATTATAACATAGCTGTTATTTGGTAGTATT 

GGGATGTGTATGTATGTGTTTGCATTTGGTAAGTTTGTATATGCAGGAGCTGAACTCTCA 

ATAAAAACCCCTTTTGGAGTTCCATTTACATAAACTTCTGGTCCAGTTATGTTGTTGGAT 

ATATTAACTGCCACCCAAACATCGTATAAAGTATCATTTATTGTAGTCCCAGTGTTGTTA 

ATTACAATATATCCAGTTATACTTTCTATTGTTGAAGATACTAAGCCATCACCTGTAGTG 

TTACCTGTTATGTTATATTTTTCGTAATATGCCACATATAGTGGTCCATTATCTCCATAT 

CCAAATACAGTCCCAATAAACAGCAATGACATTAACAAGGCCATAAATATTAACTTTCTC 

ATAACTTCACCTCATGCATTTGTGTGAAATAGGGGAACATTCTTAAGTAGGTAGTAATAA 

TGTAATGTCTTCCGTTTGTATAAATAATTTATGGAACATTTTTTAGACATTTTTGATTTT 
TCAAAAATTTAGAAAAAGAACCCAAAAAGTCCAAGGTTTTCAATTTGAAAATAATAACAG

CCGATATATAAACCTTTTGATATTAAAATTATCAATACCTAATAAAACATTTTAAAATAA 

GCAAAAATATTAATTCAATAACATATTGATTCCTTCCATTACAGCATCTACAGAAGCCCT 

TATTATATCAGCGTCTGATTTTCTAACTTCAACAATTTCAGTTCCTTTTCTTAATTTAAC 

AACAACCTCTATTAACGCATCAGTTCCTCCACCAATTGCTTCAACTCTATACTCTACCAA 

CTTAATATCTGCAACTCCACTTATTGCCTTTCTCACAGCATTTATTGCTGCATCTACCGG 

TCCAACACCATAAGCAGTTTCTATTAAAGTTATATCTTCTCCTTTATAATGGAGTTTAAC 

AGATGCAATTGGTGTTATTTTATTTCCAGATACAACAGTTAATTCATCTAATTTGATTTT 

CTCTTCTACCAATTTTCCAGTAACTTCTCTAACTATAGCCAACAAATCAGCGTCTGAAAT 

GTATTTACCCAAATCCCCAAATTCTTTAACTCTTTCATATATTTTATTTAATTGCTCATC 

ACTTVACGTTTATGCCCATCAAATCAAGTTTGTATTTTT^AAGCTTTTCTACCAGAATGCTT 

ACCCAAAATAATTCTTCTTCTATTCCCAACCATTTCTGGTTTTATTGGCTCATAGGTTTC 

AGTATTTTTTATTAATCCATCAACATGTATTCCTGCTTCATGAGCAAATGCATTGTCCCC 

AACAATTGCTTTATTTGGTGGAACAGGAAGTTTCATCAATCTTGAGACAATTCTTGAAAC 

CTCATATAACTTTTCCATCTTTATCTTAGTATCATAGCCATAGAGTATTTTTAAAGCAGC 

AACAACCTCTTCCAATGAGGCATTTCCTGCTCTCTCTCCAATACCATTAACTGTTACGTG 

GCACTGAACAGCTCCACCTAAAACTGCTGAGCAAGTATTAGCAGTAGCCATTCCAAAGTC 

GTTGTGGCAATGAACTGAGACCGGTAAATTAACATTTTCAGTTATTTTTTTAAATAATTC 

CTGACTCTTTTGTGGAGTTAAAACTCCTACTGTGTCACAAACACAAACTCTGTCTGCTCC 

AACCTTTTCCCCTTCATTAAATAGTTTTATTAAGAAATTTACATCACTTCTTGTTGCATC 

CTCTGCAGATAACTCAACAATCAATCCATGTTCTTTAGCATACTCTACAGCCTTTAAAGC 

TGTCTCTAAAACCTCATCTTCTGTTTTTCTAAGCTTATATTTCATGTGTATTGGAGATGT 

TGGCACTACTAAATGGACACTATCTACATCACATTCTAAGGCAGCATCAATATCTACAGG 

TAAAGCTCTAACAAATGAGCAGATTTCTGCATTTAAACCTTCTTTTGTTATTAATTTTAT 

TCCTTCTCTCTCTCCTTTTGAAGTTATAGCTGAACCTGCCTCTATAACATCAACTCCAAG 

CTCATCCAATTTTTTTGCTATCTCTAACTTATCATTTGGTGTTAAAGAAACTCCTGGTGT 

TTGCTCTCCATCTCTAAGTGTTGTATCAAATATCCTTACCATCATAACAATCCCTCATAA 

AAAATAATTTAATGAAATTTAAATACTCATAATGAATCTGATGATAAAATTGAATCATCT 

CAAAGATATTTGATATTGTATATTTAAAATTTATGTGGGAAATAGTTCTGGACTAATUUVG 

TTGGTAATATACATCTTTAAATTTAAATTTATAAATTAAGATTTCTTTTAAAGATTTTAT 

TCCTGCGAAAGCCCCATTAACTTTATTAATAATCTTTATAAAATTTTTATTATTTTTGAA 

AGATACTATACGAAAGTCATAAAATACTCGCATTA7VAGATTTAATACAAAACAATAGCGA 

AATTTTTATATTTGTTAATU^TTTACTTACATTAAAACAAGTAGTTTTTGCAAAAGTTATT 

AAAATTAAAAAATACCTTACTAAAGGAAGGCATTCATTACTACCCATATATTCTTTTAAA 

ATGCTCCGCAAAAACTAA7VAATGCCAATTTGGTGATAAAATGGAAAGTTACATACAAAAC 

TTATTTGCTGAGAGAATTGGTGGAAAGAAGTTTGGGAAAGAAGATGTAATTTACAAGTTT 

GAGAAAATTAAGAGAGCTAAGCAAGAGGCAATGAAAAGACACCCTGATATGGAATTAATT 

GATATGGGTGTTGGAGAACCAGATGAGATGGCAGACCCGGAGGTTATAAGAGTTTTGTGT 

GAGGAGGCTAAAAAATGGGAAAACAGAGGATATGCGGATAACGGAATACAGGAGTTAAAA 

GATGCCGTTCCTCCATACATGGAGAAGGTTTATGGAGTTAAGGATATAGACCCAGTTAAT 

GAGGTTATACACTCAATAGGTTCAAAACCAGCTTTAGCTTATATAACATCAGCATTTATA 

AACCCTGGAGATGTTTGCCTAATGACAGTCCCTGGCTATCCAGTTACAGCAACACACACA 

AAATGGTATGGGGGAGAGGTTTATAATCTCCCATTATTAGAGGAGAATGACTTCTTACCA 

GATTTAGAGAGCATTCCAGAAGATATCAAGAAGAGAGCAAAGATATTATATCTCAATTAT 

CCAAACAACCCTACTGGAGCACAAGCTACAAAGAAATTCTACAAAGAGGTTGTTGATTTT 

GCTTTTGAAAATGAGGTTATCGTTGTTCAAGATGCTGCTTATGGAGCTTTGGTTTATGAT 

GGAAAGCCTCTTTCATTCTTATCAGTTAAAGATGCTAAGGAGGTTGGAGTTGAAATCCAT 

AGCTTTTCAAAGGCATTCAACATGACCGGTTGGAGATTGGCATTTTTGGTTGGGAATGAA 

CTTATAATTAAAGCGTTTGCAACAGTTAAAGACAACTTTGATAGTGGGCAGTTCATCCCA 

ATCCAAAAAGCTGGAATTTATTGTTTGCAACATCCAGAAATTACAGAAAGAGTTAGACAG 

AAGTATGAGAGAAGGTTAAGAAAGATGGTTAAGATATTAAATGAAGTTGGATTTAAAGCA 

AGAATGCCTGGAGGAACTTTTTATTTATATGTAAAATCACCAACAAAAGCTAATGGTATT 

GAATTTAAAACAGCTGAGGATTTCTCCCAATACTTAATTAAAGAAAAACTTATTTCAACA 

GTTCCATGGGATGATGCAGGGCATTATTTAAGATTAGCAGCATGCTTTGTTGCTAAAGAT 

GAGAACGGCAATCCAACAACTGAAGAGAAGTATGAAGATATGGTATTAGAGGAGTTTAAG 

AG^lAGATTGGAGGGAATGGATTTAGAATTTGAATAATTGATTTTTATTTATTTTAAATTT 

TTCATATTTTTATTTTTACTATTCTTTATTTATATATTCGGATTAATAAAAATATCTAAA 

acctgttctaaaatttatttatactaaaatctccactatatacaatcaaatagaaaaaaa 

GAGGATGTAAAATTTTTCAAATTTTTGAAAGAAATGAAAAAAGGTG/W^GGTATGGATGA 

gtatgaaaaaatcatcaatgactttvaataccataaactcaaaagcaaaatttattggtat 

TAAGATTATTATGGTAAGAAGAATTATCGATATGCATAAAGATAATGATAAATTAATAAA 
AAAGGTATTAGAGGGTATAAAAAATACTGATCTTTATGATTTAGTTTTT^TGCATGTCC 

tgaattgaaaggagaaaggattaaagatgtttattttaagaagaatgattattttaatgt 
cattaaaaagacaatgagcagtgaaaatactgtattgaaaaatgtgttgattaatgatga 
ATCTCCGCCCTAAAGATGGGAATTTTAGGAGATTATGGGTTAAACTTTCATCCTrTrrrr

?^$^^^^^^'^=^^^'^^tt'^<^tgaatgcttatgtcataataataa^?SS?JgI^gaS 
tttatagtactacaaagtcataataggaacaaaattacatgaatattcaSSJ??S 

^^^^r?^:^^<=^'^T^^^<=«^TTAATATGAAGGCAACAGAAAACAGAS?iJJ?S 



ATAAATGAAATTCTTCTACCTCTATCAAAAAATTTAAAGAATGTTGAGGGATTTGTCATA 
GTCTCAAAGGATTCCCTTGTTAAAGTAGGAAATATTGACGGAGAAGA? 

^^^^^'^^'^'^''''^^^'^^^^'^^^'^^^'^^tagttcagagatgctctataaaagatttaatgat 

rATT^^^^3''''''"'''^"^^^^^^^C^T^'^^TAATCTTA?A?IIc?^^ 



TA 



attgctctttaaattttaaaattttaataggacttcatgggaataaaccat?ISSa 



^^^^S'^^^'^^ttaatgaatttataggcataaataaaccaataaacatatagctamtt" 

GGAGTTATACCTACATAATATACTACACAGTAAATTACGCAAAAAGATTATA^r^AlTlI 



GGGAGGGGTTAAATATATATAACAATAGTCAAAGCAGCTTTTTTATTTTT/^AATCT^ 

agaagttaagttccttctctccaatcttctaaatattttttttgctctttS^??^. 



GGATTAACTTCTGTAACTACGACCTCTGCTCCTAAGCCTTTAGCTCTCATTGCTACTPrT 

ctaccacaccatccataaccagcaacaacaacagtctttccagcaJ??^Sg??JgtI 

GCTCTCAGAATTCCATCTAAGGCACTTTGCCCAGTTCCATATCTGrSrclJJ^lGSS 
^S?I^I^!^^^''^'^T^T^^^T=^^TAACTGGAAATTTTAAAGSc?5?cm^^^ 



rTCAGTTCTCTTTGTATGCAATAAAAATATTAAATCACAGCCrTCATCTATAArAA^ 

tctggtttgtggtctaaaaccttgtttaggttttcataatJ??Sc?ISg^^^ 

nI^^^r'''''''^^^^''''^^^^^'^'r^'^"^GCACAAGCAGCGGSS?cI?cCT^^ 



™^^^^JS^I™5IIS!^GTGTAGAGCCATTCCTATTGTTATTCCTTTAAATGGC 

TTGCCCAT 
TTCATTTC 
RCTATTTT 



^J^I"r^^'''^''^"^^^^™^^TAAATTTAAAACAGGCATGTGTTGTTTTGCCCAT 

tt^^^^'''^'''''^''^^^^^^^^^^2T'^g=gtagcagagataStaaa5?ISS??5 

TTAGTGAAGAAAAGCTTTTATTATTTATAAATTCAA/ - "^^lAAArTACTATTTl 



J^^r''^'''^''''''=''^^^^^^*«™TATTAATAACTTAAACTSSlI^?AS 



^I2^!!S^!?ZJ^T^^^'^^GAAGATTTTATAAATACAATTAAAAAAGTTATTGATGT 

taacaaagttaatataat 

T AAAAACGT TA T T AAAAG 



IlAT^^n^^^rS^S^^^^I^eZT^^G^C^^^ATATATTATAAACAAGTTAAATCTAAA 

ACAGAACA 
AAAATATT 
AAGTATAT 



^^I^^JS^J^^I^J^EE^^T^-^GTTGGAAATGGTTTTGTAAGGACAGAACATGG 

3GACTTAAAATATTTTT 
iTTATAAAGTATATAAA 

GGAT TTAA A TPTTTT a a n t/^;^;^;.-; " ' "^GAAGTTTCTTATGGAGCTGGAGATAA 



l^if^^'^^^^'^^'^^C^CTGGTAGTTTAGGGGTTAGAATATT 
AGCTGATAGAGAATTTAAAACTATAAAATTGTTTGATGAATC 
GAGAGTTAATGATGAAATAATCTCTCAAAAACCAGAGTTTGA( 

TAAAAAATATGGCATTCCTTTAAAAGATTTATATAAGTTAATAAATATTTCCCAAATTZ.. 



TTAGAATATTTGATATAGAGAGAATAAC 



ACCAAAAAAT7VATTCATCATTTTCAGCATCTTCAATAATCTTATCTATCAAAGGTTTATT
TCTCTTTCTTCCAAGATTTGGAGGATAATTTTTTATCTAATTCTTCATAGTCATCTATAA 
TCCCTTTATATTTTATAAGAGCTTATTAAAAGCCATATCCCTAAGCCTATTAAACATCCA 
GCCCAGCATCACCAAATATCATTCCAATCCCAAGACCTAAGACAGTAAATCCAAAAGTTA 
TCATCCTTCTAATCTTCTTTAATTCAGTGTAATTATTGACTGCAAATATTTCTGCCATTT 
TCATCCCCTTATAATCAAAAAAGTAAATATAATCAAAAAATATGGATGTAGAGATTTGGA 
AAGTTGTTTTAACAACTGCATCATATATTTTCAAATATTTATGACTTAGAGTATAAATAA 
TTTATGATGAGGGATTATTATGGTTGTTGAGGTTTTAAGATTAGGACATAGAGGAGACAG 
AGATAAGAGGATATCAACCCACGTAGCTTTAACCGCAAGAGCCTTAGGAGCAGATAAAAT 
AATTTTTACAACTGAAGATGAACACGTTGAAAATAGTGTTAAAAAAGTTGTAGAGAGTTG 
GGGAGGAAACTTTGAGTTTGTTGTTGAAAAACATTGGAGAAAATATATTAGAGAATTTAA 
AAAAAGAGGGATTGTAGTTCATCTAACT^TGTATGGGGCTAATATAAATGAGATAATGCC 
AGAGATTAGAGTUU^TAAGCAGAGATAAAGATATATTAGTTATAGTTGGGGCTGAAAAAGT 
GCCAAAGGAGGTTTATGAATTGGCTGATTATAATGTATCTGTTGGTAACCAACCACACTC 
CGAAGTTGCTGCTTTGGCAATCTTTTTAGATAGATTGTTTGAGGGTAAAACACTTTATAG 
AGATTTTGAAGATGCAAAGATAAAGATAGTCCCATCAAAAGATGGAAAAGTAGTTATAAG 
AGAAAAGCAA7UVTAAATAATATCAAAATATATTGGGGGATACTATGGAAATCCAACTTCC 
AGATATAGAGGAAATAAAGTTAGAGGATGTTTTGATAAAGAGGAGGTCAGTTAGGGAATA 
TTGCTCATCTCCACTGACTTTGAGAGAACTTTCTCATATACTATTTGCTGCCTATGGAGT 
AACTGATGAAAGGGGATTTAAAACTGTTCCCTCTGCTGGAGCAACGTATCCATTGGAAAT 
TTATGTAAATGTGAGGGATGTTGTTGGAGTTGAGGAGGGAGTTTATAAATATATTCCAGA 
GAGGCACTCAATTGTTAGAATTTTAGATGAGGAAGTAGGGCACGAATTAGCTTTAGCAGC 
TTTAAAGCAGATGTTTATCGCCATAGCTCCAATTGTTTTAATTATAGCTGCTAACTATGA 
AAGAACTACAAGAGTTTATGGAGATAGAGGATTTAGATATGTGCATATGGAGGTTGGACA 
TGTTGCTCAGAATGTATATTTAATGGCTACATCTTTAGGTTTAGGAACTGTATCAGTTGG 
AGCATTTTATGATAATGAAATAAGGGAGATTTTAAAGATAAAAGAATATCCTCTATTATT 
GATGCCAGTTGGTAGGAAGATAGAGTAATAGTGTCTTTCAAAAAACAA7VAAATAATAAAA 
GTTATTGAGAAAAATGGCAGGATTTTCACAGGTCATAAGTATTAAATAACGTGTTTATAT 
GTATGAGGTCATCAATATTCTTTATTAAAAATCAAAAATTTAATTTCTATAAAAGCCCTA 
TGAACGCTTTTyCCTAAAGGATAGCGTTCATTAATACATTATTTATCTCATAAAAGACAC 
TATAAAGGGTGGGGATATGATAGACACTCACATACACTCAGATACAAGAGGTTTAGAGGA 
TTTGGAGTTAATGGCAATGTGCTTAGATGGAGTTATAACATTAGCTCATGACCCATTTGA 
GATGAAGAACATTAAAGTTTGGGAAGCTCATGTAGATW^GCTTTTAATTAATGAGTTAGA 
GAGGGCTAAAAAGGTTGGATTGAATTTGTTTATTTGTGTAGGGATGCATCCAAGGGCTAT 
TCCTCCAGAGATTGATGAGGCTTTAGATAAAATAAAGAGTTATATAAATTATAATAGTAG 
GGTTGTGGGTATTGGAGAGATTGGTTTGGAGAAGGCTACAAAGGAGGAGAAGGAGGTTTT 
TATAAAGCAGTTACTTTTAGCTGAAGAGTTAAATATGCCTGCAGTTGTGCATACGCCAAG 
AAGAAACAAGGAGGAGGTAACTAAAATCATATTGGATGAGATTTCCACTCTGAATTTGAA 
AAATAGGGATATAGTTATTGAACACTGCAATAAAGAGACAACAAAATGGGTTTTAGATGA 
GGAGTTTTATGTTGGATTGACAATTCAGCCAGGAAAATTAACTCCATTAGAGGCTGTTGA 
GATAGTTAAAGAGTATAAGGACTTTGCTGATAAGATTCTATTGAATAGTGATTGCTCCTC 
AAACGCATCAGATGTTTTAGCTGTTCCAAGAACTGTTTTGAAGATGAAGATTAATGGTAT 
TGAA7VAAGATGTTATTTATAAGGTTGCTCATAAAAATGCTGTGAATTTGTTTGGATTGGA 
CATATAACAAAAACCAAAAATTAATTTTAAAAATCAATAAAAAATTTTATTAATAAAAAT 
AATAGTTAGGACTCTCCGTATATTTAATTTTACTCACAAAAAATAAACAGTTTTAAACGG 
CGATATTATGGCATACTGGCTTTGTATAACAAATGAAGATAATTGGAAGGTAATAAAAGA 
CAAGAAGATTTGGGGAGTGGCTGAAAGGCATAAAAACACTATAAATAAAGTTAAAGTTGG 
AGATAAACTAATTATTTATGAGATTCAGAGAAGTGGGAAAGATTATAAACCACCATACAT 
AAGAGGAGTTTATGAAGTTGTTTCAGAGGTTTATAAAGATAGTTCAAAAATCTTTAAGCC 
AACTCCAAGAAACCCTAATGAGAAATTCCCATATAGGGTTAAATTAAAAGAAGTTAAAGT 
TTTTGTGCCACCAATTAACTTTAAGGATTTAATTCCAAAGTTGAAATTCATAACAAACAA 
AAAGAAGTGGAGTGGGCATTTGATGGGAAAAGCAATGAGAGAAATTCCAGAAGAGGATTA 
TAAGTTGATTATTGAAGCTAAAGCTTAAAACCTATTTTTTATCCTTGCATCAAGCTCATC 
TAATGAATAAACACTTAACTCTCCAGTTTTTACAGCTTCTATTGCCTTAACAGCAGCTTT 
TGCTCCAGGGATTGTAGTTATATAAGGAATACCCAAATCCACTGCTGCCCTTCTTATATA 
ATACCCGTCTGACTTTGCCTTCTTTCCAGAGGAAGTGTTTATTATTAAGTGCATCTTACC 
ATCTCTCATTAACTTTAGGATGTTATCATTTGGACTTTCAGATATCTTCTTAACAAGTAT 
TGCTGGAATTCCATTTTCTCTCAACACTTTAGCAGTTCCTTCTGTTGCGTATATTGTAAA 
GCCAAGCTCATGCAACTTTTTAGCAACATCTACGATATGCTTCTTATCCCTATCTCTAAC 
ACTTATAAAGACATTTCCAACGATTGGCAATTCCATATTTGCAGATAACTGAGCTTTATA 
GTATGCCCTACCAAAGTCTTTATCTATTCCAATAGCCTCTCCAGTAGATTTCATCTCAGG 
CCCTAAAACAGGGTCTACTCCAGGCAATTTTTGGAATGGGAATACTGCCTCTTTAATTGA 
TACATACTTCGGCTTTGCAATCCAAACCTTCTCAGCAACTTTTTCAACATCATAATCTTT 
AATTAACTCCTCCAACTTTTTGCCGAGCATAATCTTTGTGGCTAACTTAGCCAATGGAAT 
TCCAACTGATTTACTCACATAAGGAACAGTTCTTGAAGCCCTTGGGTTTrrTTrrnanTir.

ataaacaactccatctttaactgcatactgcacgtttaaaagcc?SctSgS 



CTTC 
AAAC 



^^^^^^^<^=cctaactaaaacaggataaccaattcttttagctatS??Sgcc!^ 



gttgttatttatgattatagcttcaattcccatttcctttaaagctaJ^SS™^^ 
^I"^''''''''^^^^^"^^^ttttaggaacgtcttctttaactaJS?SIIgISt^ 

TAATCAATCCCAAAATATTTATGAATTAAATATTTCTTAATCcScSJJJSrrr^^^ 
GGACGTTGGGAAACTTTTCCCTAAAATCATTATTTATGTATSTSGSS?rAATl^ 
nII^™''^^^^^''^^^^=GCATATCGTATCATTTTATTATSSSASS™?? 



^^JJT^SE!Z^!"^"T^T^TTTAAGGATTTTTTTATGCTTTCTTAGGATTTCTTTTATT 

\TAT" 

TTCC 

_ _ _ i^ATAI 

:^I!™,^!TT^«^*^TCTTTTTTCATAATTTCCCATTAAGCATTCAATTTCTTAATTTCTTC 



JS^^^^'^^'^^'^TTTCATGGTTTCACACTATATACTATATTTCTTATTCTTCTTTAArrrT 

ctttaatgccaataatccattaattacacatataattccaaSL ^^^^ 

ISS^?n^!™e?S^T^A^G"^TGCTACAAATATAAGATTCATCAAAATTAATAT 



ttaccaatattct 



^^JJ^L^!?!SS^!T?EZ^^^^^^^^tctacaatattctt^c;tcttaatgaaaaa 

:ttctctaacgcctt 
['tctatctcttcatc 



cgtatagtctttatcctttccatctccaattatgccgaa? ^^'^^^^^ 



GlS?S^^I^?I^^I^"^jrr^JS!^SI5ITJT^CTTATCAAATGGC^ 



^^^^^^^^^-^'^^^'^^'^C'^TCTCCTTATTTACAGCAAATTGTATATTACAArrTrrA 



TC 



;gattcactt< 
:aacctccct( 
'aaaactcat( 
:ccattgggtc 
^taacctcaac 
ttatagaat; 
ttcctcctcc 
cctcagcaa^ 

CATTTVATTTC 
TAATATTAGjt^ 

::taaagctaa 
::tttctcaat 

:CATGTCTGT 
:CTCTTCCTT 



I^S^I™3ST^^Eni^??^^^^^GGGCAGAGCTTCTTGAGAC^^^ 

r; 

:i 
;g 
1 

T 

c 
GTTGGAAGAATTAGGAGAAGATGTTAAAAAGAAAGAAAAAAAGGAGAAAGGTTTAGAATT
ACCAAACGTTAAAGATAAGGTAGTTATGAGATTCGCTCCTAATCCATCAGGGCCTTTACA 
TATAGGGCATGCAAGAGCAGCAGTTTTAAATGACTACTTTGTTAAAAAATATGGTGGAAA 
GTTAATTTTAAGATTAGAGGATACAGACCCAAAGAGAGTTCTGCCAGAAGCTTATGACAT 
GATTAAAGAAGATTTGGATTGGCTGGGGGTTAAAGTTGATGAAGTGGTTATACAATCAGA 
TAGAATAGAGCTTTATTATGAATATGGTAGAAAATTGATTGAAATGGGACATGCTTATGT 
TTGTGACTGCAATCCAGAAGAATTTAGGGAATTGAGAAATAAAGGAGTTCCATGTAAGTG 
TAGAGATAGAGCCATTGAGGATAACTTAGAGCTTTGGGAAAAGATGCTGAATGGAGAACT 
TGAAAATGTAGCTGTTAGATTAAAAACAGACATAAAACACAAAAACCCATCAATTAGGGA 
CTTTCCAATATTCAGAGTTGAAAAAACTCCACATCCAAGAACTGGAGATAAATACTGTGT 
ATATCCTTTAATGAACTTCTCTGTTCCAGTTGATGATCATCTTTTAGGAATGACTCATGT 
TTTGAGAGGAAAAGACCACATTGTAAATACTGAGAAGCAAGCTTATATTTACAAATACTT 
TGGTTGGGAAATGCCAGAATTCATCCACTATGGGATTTTGAAGATAGAGGACATTGTTTT 
AAGCACTTCATCAATGTATAAAGGAATTAAAGAAGGTCTCTATAGTGGATGGGATGACGT 
TAGATTAGGAACTTTAAGAGCTTTAAGAAGAAGAGGGATTAAACCAGAGGCAATATATGA 
GATAATGAAAAGAATTGGAATTAAACAGGCAGATGTTAAGTTTTCTTGGGAGAATTTATA 
TGCAATAAATAAGGAGCTTATTGATAAAGATGCAAGGAGATTCTTCTTTGTCTGGAATCC 
AAAGAAACTTATTATCGAAGGGGCAGAGAAAAAGGTCTTAAAACTTAGAATGCATCCAGA 
TAGACCAGAATTTGGAGAGAGGGAGTTAATATTTGATGGAGAGGTTTATGTTGTTGGAGA 
TGAGTTGGAAGAGAATAAGATGTATAGATTGATGGAGTTATTTAACATAGTTGTTGAAAA 
AGTTGATGATATAGCATTAGCTAAATATCACTCAGATGACTTTAAAATAGCAAGGAAGAA 
C7VAAGCTAAGATTATACACTGGATTCCTGTAAAGGATAGTGTAAAGGTTAAAGTTTTAAT 
GCCTGATGGAGAGATAAAGGAAGGCTTTGCTGAAAAAGATTTTGCTAAAGTAGAGGTTGA 
TGATATTATCCAATTTGAGAGGTTTGGATTTGTTAGAATAGATAAAAAAGATAATGATGG 
ATTCGTATGTTGCTATGCACATAGATAAAAATAATTTTTTTATTTTTTAGATTTTAATTT 
CCTAATCTCTTTAATTTTTTTAGCTAAAAGTTCATTATCTTCATTTTCAACTTTTTTAAC 
CCTTTTTATCTTATCCTCTTCTTCCCATTTTTTTCCAGTGTATCTCTTAGCTATAACATA 
AACCTCAGCACTTTCTTTTCTTGAAGCTTGAGGTTTTGTAATATAAACCTTTTCAAAGTA 
TTTTTTAACTAAATTTACATAATCATCTATCATGTCTCCATAAAATACCTTAGCTACAAA 
ATTGCCTCTCTCTTTTAGCATCTCAGTAGCTATTTGTAAGGCAGTAGTTACTAAATCTAT 
TGAACGAGCGTGGTCTATATCCCAATAACCGCTTATATTAGGGGAGGCGTCACTTATAAC 
CACATCCACCTTTTTTTCATCATTTGGAATTAGCTCTCTAATTTTGTTCAAATTTTCTTC 
TAAGGTGAAATCTCCTTTTATTGCAACTACATTATCATATTCAAATGGCTTAACTGGTTG 
TAAGTCAATACCAATAACAAAGCCTTTATCTCCTACAATCTCTCTTGCCACTTGCATCCA 
TCCGCCTGGAGCACAACCCAAATCCAAAACTATCTTTCCTGGTTTAATAACGTTAAATTT 
TTCATTTAACTGCATGAGTTTAAAAGATGCTCTTGAACGATATTTAAGTTTTTTAGCTAA 
TTTGTAGTAGAAATCTCTCTTTCTTTGTAAAACCCATCTTTTATCTTTTCTTCCCATAGT 
TTCACCACAAATTTTAAAAATATTAAATGTAATTTTTAAAGAAATAATAGGTAATAAAAT 
AAACTTAGGAAAAGCTGATTCTTATGAGTTTGTGTAAGGATAGTATTTACATCCTAATGT 
CAAATTTATACTCAAAGGGAATGGCATATCTATTTTATTTTATAACTGCATTTTTATTGG 
GAACTGAAGCATTTGGTATCTTAAAGGGATTAATGCCAATAGCTGACACTCTAACAATAT 
TTTTCTCTTCTGGTATTCCTCCAGCCATAGCAAAATTCTTAGCTGAAGAAAAAGAGGTAG 
ATATTAACAAATATATTCCAATATTATATTTAATGATTTTGCTCTCAGTTGTTGGATTTA 
TCTTAACTCCTTATATAAAATACATTTTAGGAGGGCATTATTTAAATCTGCCAAATATTT 
TGTATTTTGCAGTAGGTCTTTGTGTTGTAGCTTCAACAGTAATAGCATTTTCAAGAGGTA 
TTTTACAAGGATTGTTAAAGATG7LAATATCTCTCCCTTACGTGGATTGTTGAATACACTG 
CAAAAGTCATATTGGTTTTTATTCTAACTCTATATTTGGGAATCTTTGGCTCTTTGTTAt 
CAATATCTTTGGCATATTTAGTAGGAGGGATTTTTGGGCTATATTTGATTTATAAGGCAT 
TAAAAGGAAAATTTGATTTCAAAAAATTAATTGACATTUVAAAATACAACAAAAAACATAT 
TCTCTAATTTTAACTTAGACATTTTGAGATATTCAATCCCTATTGCTTTAACGTCATCAT 
CATACAGATTGTTTGGAGATATTGATAATATAGTTATAATGTCCATTATGGGAGGATTTT 
GGAGTGGGATTTATGGTTACTCCTCTCnAATATCAAGAGGAATATTTATGTTTGCTTCAG 
CTGTTAGCATCCCTTTACTTCCAAGAATATCTAAAACTAAAGATTTAAGCTTATTAAAAG 
AAGGAATTATCCAAAACACTATCTTCTCATCAATTTTTGTTATTGGTTGTTTGTTTTTCC 
CTGAAATCCCATTGATAGCATTTTTTAAAACAGCTAATCCAGAAGGAATTTTATGCCTAA 
GAATTTTAGCAATCTCTTCTTTATTTATGAGCTATTATACTTTAATATCCTCTGCACTTC 
AAGGTTTAGGGTATGCAAAAATTTCTTTCTATATAATATTGTTTGGGTTGGTGTTAAATA 
TTATCTTAAATTTAATTTTGGTAAATGCTTATGGAATTGTTGGAGGAAGCTTAGCTACAT 
TAATAACATCAATATCTGTCTTTTTAATTGGTGTTTTTGCTATTTTAAGAATAAAAAAGC 
ATATTATTTAATTAGCTGATACTTATACTTTCCATTTAAAAGCTC7VACTTTCAGTCCTAA 
TCTTGCTGGAATAACCTCTATTCCAGTATTTTGGGAAATATATTCTGCCTCTATTTGAGG 
ATTTGTCATCTTAACTCCCATGTGATTCATTATCAACAACTCTGGCTTTTTGTTCATTGA 
GTTTATTAAATCAATGGCATCGTTAGAGCAGAGATGCCCTTTAATTCGCTCATTTTTCTT 
TCTAACAATATTCGCTATTAAAATTCT7VACTCCATCAAAGTCTTCAATTAGCTGAGGGAT 
AAATTCAGTATCTGAAGTGTAACCAATATCTCCATAAATTGTTGATAGTCTAAATCCAAT

ACCAAACGGGTCTCCATGTTTTGTATGTGTTGCCTTTATTGTTGTATCATACAACTCTGC 

AGAGTCTCCAGGGTATAAAACTCTAACCTCTTCAAGCTTTGATTGATGGTATTTTGATAC 

AACATACTCATATTCTCCAAAACCTTCAACAACTGATAAGCTACCTAAAAAAACTCCTCG 

CTTTTTTGTCATTCCTTGAGTTATAGCTTCAACAATAATTTCTCCATCAGTGTAGTGGTC 

TGGATGGCAGTGAGATATAAACAGGGCATTAGTTCTCCATGGAGATATTTTTAGCTCGTT 

TAATCTCACTATCGCTCCCGGGCCAGGATCTACATGCATTCTAAGCTCATTTGTATGGAT 

TCTAAACCCTCCTGTTGCCTTTTTTTGTGTTATTGTTGCCCATCTTCCACCACCACATCC 

CAAAAAAATAATTTCCACCCTCAAAATACCACATCTCCTTTTTTGTGTTTGAACTGTAAA 

TTATATATTAAGCCAAAATTATTAAATCTTTTTATCTTACTTCCTTACACTCTACATTGT 

ATGTTCCATTTACAGATAGTTTTATATATGGATAAGTTACAACCATTGCTGCAAATTCTT 

TTGGAGGAATAACTTTATAATAGACAGTAATTTTATTGGCAGTTTCTGTTATATTTATTA 

TCTTTATTTTATATCCAGCGGTTGGCATCTCTTCCAAGTTTATGACTATTATAGTTTTGT 

TATCTTTGTAGTTVATAATAATATCCCCTATTTTTCTCTCCAAAAGCTCCATAGGCAATTA 

TTTCAT7VATTTAAAGTGTTTAAAATATTGGTATTATTATTTACAATTTTATCACTGACAT 

TTTGGATAGTTTGATTTTTGGAAACATTTGAGTTTATACATGAATTATTTTTTATATTAT 

AATTGCCTATTTGGGTTTTTTCAAAAGAGATACAACCACATAAAGTTATAGAACAAAGAA 

TAGCAGTAAATAGAAACAATATAATTATTTTCTTTTTCATTTTCCAAACCTCCGCTGTCT 

TCTTTGGTAATCTCTAACCGCCCTTAGAAAATCCACTCTTCTAAATAACGGCCAATATAT 

ATCACAAAAATACAGCTCTGAATAAGAGCTCTGCCAAATTAAAAAATTACTAATTCTTTC 

CTCCCCAGAAGTTCTGATAATCAAATCAGGATTTGGAAATGGCAAATTTGCTGTGTATAA 

ATGTTTATCTATTAACTCTTTATCAATATCTTCTGGTTCTATTTCTCCTCTTTTAACCTT 

TTCAGCTATCTTTTTTACAGCATCTATTATTTCTTGCTGTCCTCCATAAGCTATTGCAAT 

ATTAACAAAAAATTTGTTGTAGTTTTTTGTTCTCTCTTCAGCGTATTTTATTGCTTTTTG 

AACATTTTTTGGCAATAGATTAATTCTACCAATTGCTCTAACTCTAACTTCATATCTATG 

AATTTCTTCATCATCTGCAATCTCGTAAAACTTTTTTTCAAATAATTCCATTAATTTATC 

AACTTCTTCCTTAGGTCTTCTAAAATTTTCAGTAGAAAAGGCATATAGAGTAACAACATT 

TATGCCCAAATCCCTTGCCCATCTTAAGACTTCTCTAACCTTCTCAGCCCCCAAGTAATG 

CCCGTAGTATCTATCTTTTCCATAAATCTCTGCAGCCCTTCTATTTCCATCCATTATTAT 

AGCTACATGTTTTGGTAAATTGTCTTTATCAATAGCCTCTTCTAAAATCTTCTCGTAAAT 

TTTTAAAACTCCGGAGTTGTCTAAAAATCTATAAAAATCAATTATTACTCTTTTTCCAAT 

ACTCTTTAATTTGTTTTTTATCTTACCCAAAATCCCCACCTATTAGGAATTTAATAGCGT 

TATAGTATCTCTCCCAATTAGCGTTTCTACTTGTATCAATAAAATTGACATTTTTATCTG 

ATAAAATAACTGCCCCATAACCAGGAATTTTGCAAATCAAACCTAAACATCTGAAAATTA 

AATTTCCAGTAATTCCATCTACAGCTATAATAATATTGTATCCATCTTTTAAATATTCCT 

CTATTAATATACCATTATGTATAATATCCACATTTCCTTTAAAATGCTCAACTATTTCCT 

CAGCTTCATATATTGTTTCATCCACTACTTTATTCCTTCCTAAATCTCCTAATCTTCCTC 

CAGAAAGGACTGCAACTTTTGCTTTAATATTATAATTTTTTAAAAAGTTAGATGCAAATT 

CTATAATCCTTATTTTATCTTTTATTCTCTCATTTTTGTCTTCTGATATATCATCAATCC 

CTACTGGAGATAGTAAAAAGATTCCATTAGTAAAGGGATTCTTTAAAATTGATGCCCTAT 

AAAATTTTCCTATTCTTTCTCTTAAATAGAGAATTACTTTTGATGAAGATAAAGATCCCC 

TAACAGCCCCATCTATCTCTCCATCCAATAGTTTATCTACTAAAAGTTTTGGATTGTCAA 

TTAATTCAACCTCTATTCCTTCTTCTTTTAATTTTTCATAAGCCTTCAAAACTTCTTCTT 

TATTGTCTCCTATGCCTATAGCATACATAATTATCACTTAAACTCCACTTCTATTCCTAA 

AATATCTCTCTTCCCTTTTAAAATATCCTCTGCTATCAAAGCCCCCCCAATAGCCCCACT 

TTCTCCATATAAGACAAATATCTTTGCCTCAACAAACTCTTTAATTCTTTTTGGAATATC 

TATCGGATTCCTTAAAGTCCCTATAGAACCTGCTAAAACCACTCTTCTTTTATTTTTATC 

CAATAAAGGTAATAAGCTATTTATCTCCATAGAGACACTTAAAATTAAGCTATCAACTGC 

CAATCTACAATTTTCATCATTAAAATAGTTGTTAATTATCTCTTCTTTTGTATTTTCAAC 

ACCTTTATAGAGCTTGGCTATTTTAACAGCCCCTGCCTTTGAAAATGCTTCATTTGCTGT 

AATTTTTCCAGCATCTATATCTCTAATCATTTCTAAATCTATAGGGCCATGTAACATTCC 

AATAGCTCCAATACACGCATCAAATCCTCCAAAAATCTTACCATCTTTTATTAATAAAGT 

TACAGTATTTGAGGATATATCGGATAAAACAAAATCATTAAATCCAAATAATTTATATGC 

ATAATAAGCTATAGAAACCTTTTCTGGAGATGCTATATGGGAGTATAAAGCTCTAAACCT 

CTCATCTAAGCATTCTATTCCTCTATGCAATCCTGGAATAACAACAGCTGGCAATCCAGA 

TTCTTTAATCTCATCATAAACCTTTGTTCCTCCTCCAACCTTTTCTCCAGCTCCTTCAAT 

ACTTAAAACTCCTCTATTTTTCACTTTTTCTATTGGTAGGATTTTGTTTATCCCATCTCC 

CATTGAGTAAGTTAAAGCAATCAAATCAATATCTTCCAATGAAATATGTTTCTCCAACTC 

CTCTAAGTAAGATTTTTCTTTGAGTTCTGTTCTCTTTAGTTTAAATATTATCTTTTTATC 

ATTATCTTTTATGCATGTAGTTATTCCCGACGTTCCATGGTCTATTCCAACGGTTATCAT 

AGTTTCACCAATAATTTATGCAATTCTCTTTATTTTATAGAAATCATTCCAAATTTCTTT 

TGAAAGGCTTTTAAAATTTCATTAAAATCATGATGTTCAAGTTCTCCCAAACCATATCCA 

TCAATATAATCTTGTATTTTTATAAATCTTTTTATTACAGCCTCTTTTGAGAAAATTATT 

ACTGAAAATGATACTCTCCAAGGGAAAGTAACTTTATTTAAGTTTATTTCATTGTCCCAT 
ATTGGTGTTTGGTCATCATAATATGTCTCTTTTATAACTCCAATAGCATGAATCTCTCCG

GTTTTTGCAATTTGGAAAACGGCAACATCAAAAGGTTTTATTTGGTTATATTTTCTTATA 

7UVACTTCTCCAATTCTTTTTCTGTTTCTCTCCAGCGTTTCTATCCCAAAATCCCCAAATC 

ATATGATTATAGCAGATTTCAATATTTCTTATATTGTTAGAGCTAAAGAGCCAATATGTC 

ATAACTATATCCCTCATTTTAAT7VAATTTTTAATGAAAATATTATACTATCAAATGTCAT 

CAATTTTGTTTAACACAAATTTTATATAATTAGGTAATTTAATTACCTTAAAAATGATTA 

AGATTGATTAGGGATAGGCATGGAGAAGTTCGATATTGCGATGACAGTGTTTTTGGTAAT 

GATATTCTTATTCATATTTTTACCAATTATTTATATGCTATCAAATCCCGGAGATTTAAA 

CCAATTGTTGGATAAAGAGGTTATAGAGGCGTTTAAAACTACTCTATTAGCTGGAGCTGT 

TGCTACTCTAATAGCTCTAATTTTTGGAATACCAACTGGCTATATTTTGGCAAGGTATGA 

TTTTAAATTTAAAAGCTTTGTTGAGGCTGTTTTAGATTTACCGATGGCAATTCCTCACAG 

CGTTATAGGTATCATAATCCTATCCTTCATTTATGGTATTGATATTATAAATTTTATTGG 

TAGATATGTAGTTGATAACTTTTGGGGGATTGTTACTGTCTATCTATTTGTTGGCATACC 

TTTTATGGTTAATAGTATAAGAGATGGCTTTTTAAGTGTTGATGAAGAGATTGAGTATGT 

CTCAAGAACCTTGGGGGCTTCAAAGATAAGGACGTTTTTTGAAATATCTCTCCCATTGAT 

AAAAAATAATATCATCTCTGGGATTATTTTGAGTTTTGCAAGAGGAATTAGTGAGGTTGG 

AGCAATATTGATAATAGCATATTATCCAAAAACAGTTCCTATCTTAATATATGAAAGATT 

TATGAGCTTTGGATTAGATGCTTCAAAACCAATATCTGTTGGAATGATTTTGATTAGCAT 

AGCGTTGTTTGCATTACTAAGGATGTTTGGGAGGATGAGAGGGAGATflATGCTTAAAGTA 

AATAATCTATCAAAGATTTGGAAAGATTTTAAATTAAAGAATGTCTCTTTTGAAATAGAT 

AGGGAGTATTGTGTAATTCTCGGTCCAAGTGGAGCTGGAAAATCTGTTTTAATAAAATGC 

ATAGCTGGGATATTAAAACCAGATTCTGGTAGAATTATTTTAAATGGAGAAGATATAACA 

AATCTACCACCAGAAAAAAGGAATGTTGGTTATGTTCCACAAAATTATGCCCTATTTCCA 

AACAAAAACGTTTATAAAAACATTGCCTATGGTTTAATAATA7VAAAAAGTCAATAAATTA 

GAGATTGATAGAAAGGTTAAAGAGATAGCTGAGTTTTTAAATATTTCACATTTATTAAAT 

AGGGATGTTAAAACATTAAGTGGAGGAGAACAGCAGAGGGTAGCTTTAGCAAGGGCTTTA 

ATTCTAAATCCATCTATTTTACTTTTAGATGAACCAACATCTGCTGTAGATATTAAGATT 

AAAGAAAGCATTATATCTGAATTAAAAAAGATAAAGCATATCCCAGTTTTACATATAACC 

CATGATTTGGCTGAAGCAAGGACTTTGGGAGAAAAAGTAGGCATTTTTATGAATGGCGAG 

CTTATAGCTTTTGGAGATAAAAGTATATTAAAAAAACCTAAGAATAAAAAGGTTGCTGAG 

TTTTTAGGGTTTAATATAATAGACGATAAGGCAATAGCTCCAGAGGATGTAATTATTAAG 

GATGGAAATGGAGGAGAGGTTGTAAATATCATAGATTATGG7VAAATATAAAAAGGTGTTT 

GTCAAATATAATGGTTACATCATTAAAGCTTTTACAGAAAGAGATTTAAATATTGGAGAT 

AATGTTGGATTAGAGTTTAGAGTVACAAACAAAATTT^CATGAAATTTTTTGGTGATAAGA 

TGATTGTAGTATCAGGAAGTCAATCCCAAAATTTGGCTTTTAAGGTAGCTAAGCTTTTAA 

ACACAAAATTAACAAGAGTAGAGTATAAAAGATTCCCAGACAACGAGATTTATGTTAGAA 

TAGTTGATGAAATCAACGACGATGAGGCAGTTATAATAAACACACAAAAAAATCAAAATG 

ATGCAATTGTAGAGACAATTTTgCTGTGTGATGCTTTAAGGGATGAAGGAGTTAAAAAAA 

TAACCTTAGTTGCTCCATACTTAGCTTATGCAAGGCAAGATAAAAAATTCAATCCTGGAG 

AGGCAATAAGCATTAGAGCTTTAGCAAAAATCTACTCAAATATTGTTGATAAACTCATTA 

CAATAAATCCACACGAAACACACATAAAGGATTTCTTCACMTCCCATTTATTTATGGAG 

ATGCAGTTCCAAAGTTGGCAGAGTATGTTAAAGATAAATTAAACGACCCAATAGTTTTAG 

CTCCAGATAAAGGAGCTTTAGAATTTGCTAAAACTGCATCTAT^TCCTAAATGCAGAAT 

ACGACTACTTAGAAAAAACAAGACTCTCTCCAACAGAAATCCAAATAGCTCCAAAGACAT 

TGGATGCTAAAGATAGGGATGTGTTTATTGTTGATGATATCATCTCTACAGGAGGAACAA 

TGGCTACAGCTGTTAAGTTATTAAAAGAGCAGGGAGCTAAAAAAATAATTGCTGCATGTG 

TGCATCCTGTTTTAATTGGAGATGCATTAAATAAGCTCTATTCAGCTGGAGTTGAGGAAG 

TTGTAGGGACTGATACATATTTATCAGAGGTTAGTAAGGTTAGTGTTGCAGAGGTTATTG 

TTGATTTATTAT/IATTTTTAAAATTTTT/^TTTTTTATCCTAAAAACCCAATAAACTTTC 

CTAAGCAATAAAATACACCAATAGATGCCCCTAAATTTGAGAGAGTGGCAACTAACAAGA 

CTCTAAATAAATTGTTGTTTAAGAGCTCTTTAATTGATTCAGCATTTATTATTCCCACTA 

AATCTTTATCTGTTATCTCTCTATACTTTAACTCTACAAGTCCAGCTATCGTCCCCACAG 

CCGCTAATGGTTVATGGGACGAGAGTAGTTATAGGGGCTGATAGAAAGGCAACTT^TGCAG 

TTATCAACTTCCCTCTTGCCAATAAAACTCCCAAGGCAGATAAGCCCCCAGTAAATAATA 

TCCATTGAAAAGTAATCATCTTTAATAATTCTGGATTATTTAGGGCGTAACATATCATAT 

ACAAAAAGATGCTAATTATAGTCAATGAAATACCATATGTTAAAAGCTTTTTTAATGATT 

TTTTTCTCTTTTTTACCTTTATTAATTCCATTAAATCAATATCATTTCCATTTTCAAGCT 

TTTTTAAATATCTTACAATTCCCTCAACATGTCCCGCTCCAACTACTGCCACCAAAGAAT 

TTTTATTCTTACTCAATTCAAATAACCTTTTAGCCATGAATCTATCTCTTTCATCTACTA 

AGACCTCATATATTGTTGGAGATATCTCCTTTAGCAATTTAATAAATTTTTCAGGATTTT 

TAACCATATCGTTTAATAAATCATCATCTAATTCCAAATCTTCCTCATCAGAATTTAATA 

GCTCCCAAAAAATCTTCATTTTTTCTTTAAATGTCATTCTATCCATTAATCTTGATAAAG 

TGATATCTATATCCCTATCAATTAGATATATTGGCAATCCATATTTGCTTGCTATTTCTA 

TAGCTTTTTTCATCTCACTACCTGGCTTTATTCCAAAACTCTCCCCTATCTTCTTTTGAG 
AATTAGCTAAAATTAAATATATGAAAAATTTTAAAAAATTCCCTTCCTTTAATACTTTTT

TTAAATCCACTTTTTTCTCTTCATTTGTAATTAATGAGAAAAATCTTCTATCATCAAGCT 

CTACTGCAATTCCTTCTGGAGAGACAGATGATATAATTTTTTCTACTTCTTCAATACTAT 

CCTTTGAAACATGAGCAGTTCCAATTAAATAGATATCACATTCATTAACTCCATTAAATA 

CTCTAACATGTCTCAAAATAATCACCATCTATTTGTAAAAAGTGTCGTTGATATTTTTGT 

AGAAATTATTATTAAATTGTGCCTTTAAATATTTAACACTMCTATTAATTGTTTAATTT 

ATCTTTTTCTTTTTTTAGTTTTTACACCTAAGAAAGCCCTTTTTATTATAATTGTTGCAT 

AACTTCCTTTTTCCAATTCATAGCTTAAAGTTATTTTATATTTTCCTTTATTCAATTCAT 

CCTCTTCAAACTCTCCAATTTTTAAGTTTTTAGGGATTGAAAGAATCTTTCTTTCACTGT 

ATATGAACTTCCCTAACTCTCCTATATTATTTAGCTCTTCCATAGTAAGGCCTTCTCTCT 

TTAAGATTTCTTCAATAATTTCTTTTTCTTCTCCACTATATTCAATGTCTGGAGCTATTG 

TTGGAAATTTTTTATCTTTCAATATATTAAACACTTCCTCATCCATTTTTTTATAGAACA 

TAAGGGTTCCACATTCATATTCATAATAAACCCTATCTTCTTCTGGAACATATTTTCTTA 

ATAACTCTTTTACACACTCATTCCATAGATAGCTTTGATAAGCAGCAACAAAAATTTTCT 

TCAGCCTATCATCAACATAACTTAAAGCTTTTTTATAATCATTGCTTTTTTTAAGCTCTT 

TAACCATATTCACATATAATCTTGACTTTATATTATTTTCCTTAATATACTCCCAAATTT 

TATCCCAATCTCCCCAGTTTTTATCTATAAATCTCTTTAAATCTTTTATTAATTTCTTTT 

CAGATTTTTTATATTTCGTTAGCAATATTTTCACAGCTTCTTCATAATTGCCTTTTATAA 

CTTCTTTGGCAATGAATTTTTTATCAAAAACGCTTCCAAATCTCTGACTATCAAAATAAT 

TTGGAGCTCCAAATTCTAAGTATTTTAAATTTTCTTTTATTTTTGGGATGTCTTCTTTTT 

TTAAACCCCTAACTGTTATTGTGAATCTATTTCCCTCTAAATCTCCCAACAATAGAAATT 

TTGATTCTCCGATTAACTCTAATTTTAAATTTGGTTCATCTAAGCTTAATTTTCCATATT 

TTTTTGGTATAGATATATATTGAGTAGTTAAAGCATGCCTATCTTTTAATCCACAGTATC 

CAATATCCTTCAATGGAATTTTAAATTTTTTTGCAATATAAGAGAATGCTTTCAAACTCT 

CTATATTTCTCTTTGTTAATTTATAAAGGTAGCATCTATCTCCAGCTATTTTATTAAAAT 

CAATAATTTCTTCAACGATAAAATCCTCTGGCTTCATTCTAAGTTTCATAAAAGCACCCC 

AACAATATAAACTTCTATTATTAAACTTAAATTTAAAAAAAGACTCTTTGGTTGAAATAT 

TTTTCATAAAAAGACTTGAAAATTCACAGGAATTAGTTCCACAGAAAAAATAACCTAAAG 

GAATTTTTAACTTTCTTGGGTAATTTTTTAACTCTAAAATAGATGACGCGGGGGCCGGGA 

CTTGAACCCGGGCTGGGCGTTGCCCAATGGGATTAGCAGTCCCACGCCGTACCAGGCTGG 

GCCACCCCCGCAATAAAAGCAAACACTAATTGGGTATAAGGTATATATATAGTTTTCGTT 

TTTACAATACAAGATATAGAAAATTAAAAACTATTTCGAACCCTAAAACATACAAAATAA 

AAACAAACTCATAAATTCTCTTAAAAATAAAACTTTAAAATTGAAAAATTAGTAATACTT 

TTTATTAATTTTCCAATACCAAAATCAAACAACCTACTTATAATCTTAAAAATCCGAAAG 

ATTTCTAAAACCTGTTCGCTATGCTCACAAGAAGCAAGAAATTAAATTAAAAAATCTATT 

ATGCATATTAAAATTCTCAATAAAGCATAATCTATTTATATTTTATACATCACTATTTGT 

CATTAATGATAATGATAAATTACTGGTGACAGTGATGATTAAAAAAATCGCAAGGAAGAA 

GTGTATTCATGTAATGCTTGTTTATACTGGTGGATGTAACGCTTGTGATATTGAAGTTGT 

TAATGCTATATTCTCTCCATTTTATGATGCTGAGCAGTATAATGTTTTTTTAACATTTAA 

TCCAAGAGAGGCAGATATTTTAGTTGTTACTGGTTGTGTTACTAAAGTTGTTGCAGAATC 

ATTAAGAAAAATTTATGAGAAGATTCCAGAACCAAAGGCAGTTGTTGCTGTAGGAGCTTG 

CGCATTGATGGGAGGAGTTTATAAAAACATTGGAGGAGATTTAGGAACTTCAGATTTTGT 

TGCAGGACCTGTTGAAAACATTATTCCAGTTGATGTTAAAGTGCCTGGCTGTGCCCCAAG 

ACCAGAGGATATTATTGCTGGGATAGTTAAAGCTCTACCTAAGGTTATCGAAGGAAAATG 

AGGTTTTTATAAAATTTTATGAGTGAGAATGATTTATGTTTGTAAAATTTCTTTAGTGAG 

GGATAGGTTATGATTGATGAGCTAATATCCATAATTGGCATTCCGGCTTTAGCATTTGCA 

^^^^"^^^'^^TATTCCGGGAATTCAGAGAAAGATAGAGGCAAGGATACAACAAAGAATA 

GGGCCGAGTATATTAGCCCCAGGATTTTGGGCATTTTTTAAGTTTTTATTTAAAGAGArA 

AAAGCTCCTGATGCAAATTTGCCAAAACTATATAATTTGCTGCCTTTGTTGTCTATAGTT 

GTGTTGTGGGCATTGTTGTCTATAACATCATTAACATCCTTCCATATATTATCTAACGAG 

attggtattgttggattgctgaagttggaggagatgatgtatgttatattaggttct™ 

GCATTTTCAATTATGGGCTGGAAAATGCCGTTTATAGATGAATGCAAAGGCACACCGTTT 

^^^^^^^^^^^^^'^^^^^^^^^^^^ttaggagctgtaagaagctttaaaatgata 

ACTATAGGTTCATTTCCATTTTATTTAGCAACATTTTTGCCATTTGTTCAAAAGA^ 

atattcttaaaagatattgttggagaaccatttttattctcattggctgggatatttgga 

gctgcgtgttatttcattggatatgtgataatgattaaagaatatccattctcaataaS 

cacacaaaggcagatgttattgaaggtcctacaatggaattaattgcaaaatatagagct 

ttatatttagcaagtaaggaacttttgttaatagctttaggaagtttatttSS^^ 

tacttaggaatagctccagatatagagaatcctataacaatagttgaaaactttgctat^ 

gctttgatattccctatattggccacatttgttagggcattttcgccagtacttttatt? 

aaacagatatatcctatctcctatgtggcaacactaattggtgttattggctttatattt 

gcattgcttggatggtaaagtatttcagaaaatatctaatgagttatgaSSg^^^ 

aaacagcaataataattattttaaaaattatctttgagattctggttStac^^ 

aattttaacaactactactggagtatgctccaatttttccaaaatctcttcagtta^^ 
ATAGGTGTTGTATTTTTCTAAAACTTCATTAATGTTTTTTATTGTTTTATGCACTCTATC
ACAAATAAAAGCTATCCTTGTAAGCTCACCAATAGAATACTTTTTATATACCTCATTTCC 
AGCTATTGCTAATGGTGTGCATCCAGCAAATCCACCAATCTTTAAAACATCTGCCAACCT 
TCCAAGAATGAAACCAAAAACTCCAATACCACTTATTATTCCATCAATTGCATAGGGAAG 
AGCGGTAAAATAGAAATTCAAAACTCTATAAGTAGCATCAGTATCGGCAACCATAACAAC
AACATCAACACCTAATTCTTTTTTTATCTCTTTATATAACTCCTCAGCCCATTTTTTTGG 
ATTTTTTGGGAGAGGACAGGCATAAGTCCCAGGGACGTTTGTTAAATCAACTCCTCCTTC 
AGCATAAGGTTTTAAGGCATATCTTAATCCAACTATTTCTATAATTGTCTGTTTATGCTT 
TAAAGTCTCTTCCTTTGGCATTCTTCTCAAATTTTTTATTTTATCTTCTTTAACTTTCAA 

TAACTTTCCAAGCACATAGCCCCAAAGATATTTAGACCAATAATAGCAGAGATAAGCTAA 
AACTCCTGGTTTAAATTTACTCTCATCAATAAAATTTCCCTCAGCAGTTGA/^CCATCTT 
TTCACTTAATACTACAAAATCTCCATCCTCTAATTTAATTCCACTATTTTTTATTGCTTC 
AACAACAATTGGGATAAAATTTTCTCCCCTTTTTATGTATCTTGTTTTGATGGGATAAGC 
TCTCATATTCTCACATGTTAAAGTGTATCACCGTAGCAATATTGAATAGGATATTTATAA 
ATATGGCTAATTAATAAATTATTATTTTGTTAGATAAAATCAAATTTAATTATGTGGGGG
AGTATGCCAAGAAGGAAAATAGACAAATTGTATGTAAAAATCTATTTCGAAGGTAATGCA 
ATAGAAGGTGAATATGATTTTGACGCAGTTACACACTTAAAAAATGGCATATTAAAATAC 
CTATGGACTGGAAAAAAAGACCCAATT^TAATTTGGAATATGGATAATAAGTCATTTACC 
ATTATTGACCCATCAAAGATATGTGCTGTAGAGGTACAGGGTTCATTAATGTTCTTAGAT 

GACATCCCTGAAAAGAAATTGGAAATGAAGTCATTTAGTGAGAGAGATTAGCTTCCTTAC
TTAATTAATAATTAAAAATAATTAAAAACAATAACTACTAAAGGTGAAAACATGAAAAGA 
GTAAAAACTGGAATTCCTGGGATGGATGAAATCTTACACGGTGGMTACCTGAAAGGAAT 
GTTGTTCTATTATCTGGAGGGCCTGGAACTGGAAAATCCATATTCTGTCAGCAATTTTTA 
TACAAGGGGGTTGTTGATTACAATGAACCAAGTATTTTAGTAGCTTTGGAGGAACATCCT 

GTTCAAATTAGAGAGAATATGAGACAGTTTGGATGGGATATTAGAAAGTTAGAGGAAGAG
GGAAAATTTGCTATAATCGATGCCTTTACATACGGAATAGGAAGTGCTGCAAAAAGAGAA 
7VAATACGTTGTAAATGACCCAAATGATGAGAGAGAGTTAATAGACGTTTTAAAAACTGCT 
ATAAATGATATTGGAGCTAAGAGGATAGGAATTGATTCAGTCACTACCCTATACATAAAC 
AAGCCAATGCTGGCAAGAAGAACTGTCTTTTTATT7VAAAAGAGTCATCTCTGGTTTAGGA 

TGTACTGCTATCTTCACTTCTCAAATATCCGTTGGAGAAAGAGGATTTGGAGGACCAGGA
GTTGAGCATGCAGTTGATGGGATTATAAGATTAGATTTGGATGAAATTGATGGAGAGTTG 
AAGAGGAGTTTAATCGTATGGAAGATGAGGGGAACAAGCCATTCATTAAAGAGGCATCCA 
TTTGACATAACCAATGAGGGAATAATTGTATATCCAGATAAGGTATTGAAGCTTAGATAA 
AATTTTAAGGGAGAGGATGGAGTCATTTATCTTAATTTTATCTTAATTTTATTATTTTTG 

GGAGTAGTGTTTGCTTTTGGATTTTATTTGATATTCATAAAGCTTACTGGATTAAAATTG
ATGGATTATTTTCCAAGATTTAAAGAGAATAGACTAAAAATGATTTTTAGTATTTTAAGT 
GTGAnTCTTGCCTTTCTCATAAATTGGTTGATTATGAAAAATTTTAGTTTTTTGATTGAG 
ATAATTCATCCAATTGCATCAGTCTGGATATTTATTATATTGATATATTTATTATTAAGA 
TTCTTATTCCTTAAAAGAGTCCCATTATCAAATTATGAAAAGAAATTCATGGGAAATATG 

TCTGCAATAGCCATATTTCTTGAACTTTTAAAAATTATCGAATATGTGGATGAGCATAAT
ATTGCCTCTCCAATAACAGTTGCTTTAGTGTTTTTTATCCCAGTTGTTGTTTTTTTTAAT 
TGCAAGTATTTTTATGAAATGGAGTTGTCAAGTTAGCGATTTCATCCAGTTGTAGAACAT 
CATGAAGCTTTTTATCCAACTAACAACCATTAGGTTTTGTAAATACAGCAAAAAGGTTAT 
ATCCTATTAGTTGTTATCTATAAAATATGGCAAAATAGGGGGGCTGGTAGTTATGGAGAT 

AAACAACTTTTACATCGGCTTTATTGGATTAGCAGTATTTATTTTTGCTATTACTATTAT
GTTCTATATTTGGGCTTTTAAGTTTGATAAAAAGTATTTGGCTAAGGAGTAGTAGAACCA 
TCTTTTTTAATCCTTAAAACCATVAAATTAATT^TTAAAAGATTATCATGTGAAACCATGG 
AGACGTCAAAGAAGTTAGTTATTGTTGCAGTTCTCTCAATAACATTAATTTTAACTTATG 
CCTATTTAATAAGCATAATTGAGGGGGTTGATTATTTCACAGCTCTATATTTCAGTGTTA 

TTACAATAACAACCACAGGTTATGGAGATTTTACTCCAAAAACATTTTTGGGGAGGACAT
TAACTGTAGTTTACCTATGTGTTGGTGTGGGAATAGTGATGTATCTCTTCAGCTTAATAG 
CGGAGTTCATTGTTGAGGGGAAGTTTGAAGAGTTTGTGAGGTTGAAAAAGATGATVAAATA 
AGATTAAAACTTTAAAAGACCATTATATTATCTGTGGATATGGAAGATTAGGGAAGGTTG 
TGGGGGAGAAGTTTATTGAAGAGAATATCCCATTTATTGCTATAGATATTAATGAAGATG 

TCCTAAAGGAAGAGTATGAAAAATACCCAGATAAGTTTTTATACATTGTGGGGGATGCTA
A7VAAGGAGGAAGTATTGAAAAAAGCAAAAATTGATAAGGCAAAGGGATTAATTGCTACTC 
TTCCTTCTGATGCAGATAATGTGTTTTTAACCTTAACAGCAAGAGTIATTAAATCCAAACA 
TTTTAATTACTGCTAAAGCAGATGAGAAGGAAGCCATAAGAAAATTAAAAATAGCTGGGG 
CTAATAGAGTAGTGTCTCCGTATTTAATTGGCGGATTAAGAATGGCTGAGGTCTCTGTTA 
GACCAGGGATTTTGGACTTTTTGAGCACATTTATTAAGATAGCTAAAGATGAATATGAGG 
AAGATATTGAGTTGAGAAAGTTTGTCATTGAAAAAGATTCTGAATTAGCATATAAAAGTT 
TAAAAGATGCGAATATTAGAGGAAAAACTGGGGCTACT^TCTTAGGTATTCGAAGAGAAA 
AGGAGTTTTGTATAAATCCTTATCCAGAGTTTATTCTAAAACCTGGTGATGTAATATATG 
CATTTGGAACTGAAGAAAACTTAAAATATTTGGAAAATCTTGTTAAAAAGAAAAAGAAAA 
AGTTATAATCCCATCTTTTTTATTCCCAATTTAACGGCATTCTTTTTTAGGTTTTGGTTT

ATCCCAATATAATCTAAAACCTTGCTCTCAAAGTTCATGTATGCAGCAACATTTAACAAT 

ATAACCCTCTTCAATAAATCGTCTAAAATATCAACAATCTCCTGGGCATCTTCTTTTAAT 

TCCTCTTTTAATTTATTTTTCAATCTGTCTTCAATTTTAATAATTATTGCAGTTAAATAC 

TCTGGTAGGAAATCTTTCTCTACGTATTTAATTCCTCTTTTTACAATTTCTTTATAAAAT 

TCATTTAAATCACTATCATTGGATATTTCAAATAAACCCTTTATCCAATTTTTAAAATCT 

TCCTCAAGTTCCTTTCTTTTACTCTCATCAAATAATGTCTTTGTTTTTTCATATGAGAAG 

ATATCATTAAATACCTCTTCAACCACTTCGTCTATTGTCTTACTAATTAAGTCCTTATAT 

TGTGTCAATTTAGAAAAATCCTTCTCTTCAGATGCAAAGTGATGCATATTTTCAATAATT 

TCGTTGTATATTCCATCGAATGTTACGTTCATTACTACCCCCTCCTCTATGTTAATTTTT 

AATTTGAATTATGGGTATATACAAATATAGTGTTATTTTATTGTTTAATATTATTACTTA 

TTTAGTTTAGTCTGGAGTTTTTTAAATAAAAAATTAAGAATAATAAGTTTCTATTTAACT 

GCATCTACTAAAAATATGATTTAGAAATGGTATAAATACTTATTGGTTATTGGTAGAAGT 

TTAGATAAGCTTCTACCAATTTAACTCCAGCGTCTGTTAATCCCTTCTTCCCTATGAATC 

CAACATTCTTTAATATAACCAAGTTCTTTTTAATCTCCTCTGGGGTTAAAGATAGATACT 

TAGCTAACAATACAATTTCATTTTCTTTATGTTCTTTACCCTCACTCTTCTCTTTCCAAA 

CTTTATCAAACTCTTCTTTATGGTCATAGATGGTTTTTAATATGTTGAATGTTGTTGGAG 

TTACAGCAAACTTTGTTGCCAACAATTCTTGGATTTTCGCCATTTCTATAGCTGTTTTAA 

CCTCTTCCCCTAATTCTGTTAATTTAATCATCTTGTTTTGCAACTCTTTAATGAATCCTT 

TGGATTCTGCCTCACCTAAAGATTTAATGATTTCTTTCTCATCTCCACCAACATGACCTT 

GGATTAACTTAACTAATTCATCTCTGTGGATATATCTCTTTCTTGGAATCTTAATTAGAG 

TTGAGATATCATATTTTGTTAAGTATGGCTTTCTTTTAATTGTTTTTGATAATTTTATTA 

AGTATTTTCCTTTTTCTGTTATTCCTCTTTTTACAGTTCCTTCTTCTTTACCTTTAATTA 

CCCATTCAGCAATTGGCACATCTCCACTTTCTGGATAGGTTATAGCTTTCATTCCATCTG 

TTGATACCTCTCCTAAATCTTTAACCTTCTCTCCAAACTCTGTCATCCAATATGTGTCTT 

TATTTTTAACAACTTCTCTCTTTATTAACTCTTTTGATTCTAAGGTGTGTAATACAGCCC 

CTAAGTCATCAATATTTGTTCTCTTTTTAATTTCATCGTAAGTTGGTATAATTTCAGGGT 

TTGTTTCATACTTTTCCTTTATTTCTTCAATTGCCTTTAAAACCTTAATCTCATCCTCTA 

AGACATAGATTGGGAGGGTTTTCTCCTCAACCTTACCCATCTCTTTATAAGTGTCCATCA 

TTGCTTGTCCTAATTCAGTAACTTTTCCTTCAGCATAGAAACCGCTTTCTGTCATCTCTT 

CAGTAGTTTCTCCTCTCTCCAAAATTTCAAAGTCCTCTTTTCTCAATATTAAAGCCCTAC 

TTAAGTTTGGAACTAATGAAGCTACTTTTAAAACATAGTTCAATGCCTTTGTTGTAGAAA 

ATGCCTTTCCACTTTCTGTTTTTGGTGAAATTAAAAGCAATCTCATAGCTTGGAGTGCAT 

TAATTATGTTATCTCCATATTCTTTAGTGTTTTTGTAAGTTATTAGCTCATCATAAACTC 

CAATCTTTGGCATATCTTTTATAAATGCCAATAATTCAGGAGTTAGATAAACAACTGGAT 

GTGTCTCTCTATATATCTTTAAAATCTCTTTTCCAATCTCTGTTAAACCATTTTCATCTG 

CTAAGAATCTCTCTTTTAACATTAACATCCAGTCCTCTGGAACATTTCCAGTTTCCTCCA 

ACAATTCCATAATTTTAATAATCTCAGAATCTACAAATATATCTGGAATCTTTTCTAAAT 

CAATTTTATCAACAATCTCCATCAATTTTTTACCAGCTTCAGTGAATATTATCTTATCTC 

CTTTTAATTCAGCAAATCCTAATATAAACAGCTCTAAAGCTCTTGTTTTAAACTCTTCTG 

GTAGAGCTTTTTCTATCTCGTTCTGCATTTCTGTCTCTTTCATCTTTTTTAATATTTCCA 

AGTGTCTCTTTTTTAGGAACACGATATCACCTCATCACTATTTATTATTATTTTTGTATT 

TATTTTTATTGTGTTGTAAGTTCTTCAATAAATGCTTTATTGAATTCTTTTATTGCTCTA 

^J^J^^^'^^^^CAACATTATCATAAAAACTGTTGCTCCAAAGAATGTAACCCATACA 

^o^^^^^ '^^■^^'^^^^^^^^^■^■^'^^CGTTAAAGTTACTAAGAACAAAATTGAATAT 

CCAACATCAATTACATACCTGTATTTTTCCATCTCAAAGAGAAATATAGAAGTTAGTGTG 

AATGTTGTTATTTTCTCTAAGGTATTTGTTAAATGACTTTTATTTATTATTTTATTTTrA 

AGTTCTTTCACATCTTCCCCTTTTTCATAGGCTTTGCTAATCTTTCTTATTATTTCATAT 

nnl^J^^^^^^^^^^^^^^'^^^^T^^^^T^CTAAAAAACCCATGGCTGAGAGAATT 

CCAAGTATTATTCCTACATCCATTATTTTCCCCTATCATATTTTGTTCTAATATTGCTAA 

TTTATATTCATTTTTTACTAATTAAAGTTCTCACTTTTTTATGTCTATGAAGTTCTAT^ 

AAACTTTTCTATAATCAATTAATTTAAATATGTTTAGAAATTTATAAACATAAAAATTAA 

AAAATAGGATAAAAATTTACAGTTTTTAAACTGATATAGCACCCGCCACCCTGCGAACCC 

AAATTTAAAAGAATTAGGTGAAACCATGGAATTTAAGATTGTAAATACTATCTGCCCTTA 

I^^J^^^G'^C^TCCAATAAATGAAGGAAAGTTATGTGCTAAAGGAAATTATTGCTA 
J^I^^^'^^^^^^^GT^^GGATAGATTAACAAAACCATTGATAAAAAAAGAAAGTGGTTT 

tgttgaaactacatggaataaagctttagaagtaattgcagaaaatttaaagacctatII 

^rrI^r^^SS?^I^''"^'^''''^^^^'^'^TT^'^CCATTGTGCAAGGTTGTGACA 
CCCATTAATCGCAAGAAGAATAATGAGAGCCAAAGATAAAGGAGCAAAAATAATAGT?A? 



AGACCCAAGAAGAACAATAACTGC7VAAAAACTCTGATATATATCTAC7UVATAATTCCTGG 
AACTAATGTTGCCTTAATAAACGCCATGATTAATGTAATTATAAAAGAAAATTTGATAGA 
TAAAGAATTCATAAAAAATAGAACAGAAGGCTTTGAGAAATTAAAAGAAATTATTAAAAA 
ATATACACCAGAATATGCATCAAAAATATGCGGAGTTGATAAAGMCTGATAATTGAGAG 
TGCTAAAATTTATGGAAATGCTGAAAGGGCATCTATCATATACTGCATGGGAGTAACACA
ATTTACACACGGTGTTGATGCTGTCAAGGCATTGTGTAATTTAGCCATGATAACCGGAAA 
TATTGGTAAAGAAGGAACTGGGGTTAATCCATTAAGGGGGCAGAATAACGTTCAAGGAGC 
TTGTGATATGGGAGCTTTGCCAAATGTATTTCCTGGGTATCAAAAGGTTGAAGATGGCTA 
TAAATTATTTGAAGAGTATTGGAAAACTGACTTGAATCCAAATTCTGGTTTAACAATACC 
AGAGATGATAGATGAATCTGGAAAAAATATTAAATTCCTATACATAATGGGAGAAAATCC
AATAGTATCAGACCCGGATGTTAAGCATGTTGAAAAGGCATTAAAAAGCTTAGATTTTTT 
AGTAGTTCAAGATATATTCTTAACTGAAACTGCAAAATTGGCAGATGTTGTTCTTCCAGC 
TGCATGTTGGGCAGAGAAGGATGGAACTTTTACAAACACTGAAAGGAGAGTTCAATTAAT 
AAGAAAAGCTGTAAATCCACCTGGAGAGGCTTTAGAGGATTGGATAATAATCAAAAAATT 
AGCTGAAAAACTTGGTTATGGAGATAAATTTAACTACAATAAGGTAGAGGATATATTTAA
CGAGATTAGAAAAGTTACGCCTCAATATAGAGGCATAACCTACAAAAGATTAAAAATTGA 
TGGCATTCATTGGCCTTGTTTAGATGAAAATCATTCAGGAACAAAAATCTTACATAAAGA 
TAAGTTTTTAACAGATAACGGTAGAGGAAAGATATTCCCAGTTGAGTATAGAGAAGTTGC 
AGAACTACCAGATAAAGATTATCCTTTCATTCTAACAACTGGAAGAATAATATTCCACTA 
CCATACTGGAACCATGACAAGACGATGCAAAAATTTAGTTGAAGAGATTAATGAACCATT
TATTGAAATAAATCCAGATGATGCCAAATCATTAAAAATTGAGAATGGTGATTTAGTTAA 
GGTGATTTCAAGGAGAGGAGAGATAACTGCCAAAGCAAGAATAACTGAAGACATTA7VAAA 
AGGAGTTGTATTTATGCCATTCCACTTCGTTGAGGCAAATCCTAACGTATTAACCAATAC 
TGCGTTAGATGAGTTGTGTAAAATTCCAGAGCTTAAGGTGTGTGCTGTAAAGATTGAACG 
AATTTAATTTATAGAATTGTTTATATAATAGGAATCATATTTCCTAATGTTATGGGGTGA
GAGTATGGAAGAGATAGTTAATAAGATTACAAAATTTATCAGGGAGAAGGTTGAAGAAGC 
CAATGCCAATGGAGTTGTTGTTGGATTAAGTGGGGGTATTGATTCTTCTGTTACAGCTTA 
TTTATGTGTTAAGGCACTTGGAAAAGATAAAGTTCTCGGCTTAATAATGCCAGAGAAGAA 
TACAAATCCAAAAGATGTTGAACATGCAAAGATGGTTGCTGAGAATTTAGGAATAAAGTA 
TATTATCTCAGATATAACAGATATCTTAAAGGCATTTGGTGCTGGAGGTTATGTCCCAAC
GAGAGAGTTTGATAAGATAGCGGATGGAAATTTAAAGGCAAGGATTAGGATGTGCATCCT 
CTATTACTTTGCAAATAAATATAATTTATTAGTTGCTGGAACTTCCAATTVAATCTGAGAT 
TTATGTTGGATATGGAACAAAACATGGAGACATTGCTTGTGATATAAGACCAATAGGCAA 
TTTATTTAAAACAGAGGTTAAAAAACTTGCTAAATATATTGGTGTTCCAAAGGAAATTAT 
TGAAAAACCACCATCAGCAGGGCTTTGGGAAGGACAGACAGATGAAGAGGAGCTTGACAT
TAAGTATGAAACTTTAGATACGATATTAAAGCTTTATGAGAAGGGCAAAACTCCAGAGGA 
GATTCATAAAGAGACAAACATTCCATTGGAAACTATTAACTATGTGTTTGATTTAATTAA 
AAAGAATGAGCATAAGAGAACTTTACCTCCAACACCAGAGATTTAATTTTTAATTTTAGT 
TTAAATATTTTATTTTAGTTATTCTATTTTAAAATTAAATTATTTTATATATTGTAATAT 
TCCAAATCATAAGTCTCAGACCATAATTATTTAAATATAACTTCAACCAATATTTAGAAA
ACCCAAAAAACTATCTCTTTTATATCTCTACGGAGGGTTGTTCATGTGTGGTATTATCGG 
TTTTATGAGTAGAAAAAAAAGAATGATAAAAGGGGATAAGATAGCGTTAGCGTTAGATAG 
TCTAAAAGAGAGAGGTAATGGGAAGGGTTCTGGTTATGTAGGTTATGGAATATATCCAAC 
AAAGTATAAAGATTGCTATGCATTCCACATTTTAATTGACAACACACCAAAGTTTGAGAA 
AATAAAGGTAGAGGTTGAGAATGTCTTAGAGCAGTATGGGACAATAGTTAAAGATGAGGA
AATACCAACAGAAGATGGCATTATAGT^AAAAACACAAATTCCTTGGAGATACTTTTATGA 
AGTTGATGAAAAATTTGCTGATAGAGAGGT^GATGTTGTCGTAGATATAGTTATGGAGAT 
TAATGACAAAATAGATGGAGCTTTTGTCATTTCAAGTGGTAAGGATTTAGGTGTTTTTAA 
GGCAGTAGGATGGCCTGATGAGGTTGCTAAATTCTATAGAATAGATAAATATGAAGGTTA 
TATGTGGTTAGcACATGCAAGATATCCAACAAACACAAGAGCATGGTGGGGAGGAGCTCA
CCCATTCAATTTATTAAATTGGAGTGTAGTGCATAATGGAGAGATAACAAGCTATGGAAC 
AAACAAAAGATTTGTTGAAATGTTTGGTTATAAGTGTAGATTATTAACCGATACTGAAGT 
TGTTGCCTATATATTAGATTTATTGATGAGAAAACACAAAATCCCTGTTGAGTATGCCTT 
ATCTGCTTTAGCACCAAGATTTTGGGATGAAATAGATAAGATGCCAGAGGAAGAGAGAGA 
GTTACATACAGCAATAAGATTGGCTTATGGAGGAGCTATGCTAAATGGTCCTTTCGCAAT
AGCAGTTGGAACTCCTCAAGGTTTAATCTTTATGAATGGAGATATTGAGAAAGACACAAC 
AATGTTTGGTTTAACAGATAGAATTAAGTTAAGACCATTAATTGCAGCTGAAAAGGATGA 
TATGATATTTATTTCAAGTGAAGAATCTGCTATAAGAAGAATCTGCCCTGACTTAGATAG 
AGTTTGGATGCCTGACGCTGGAATGCCTGTTATAGCAAGACCTTGGAAATAAACAAAGAT 
TAAAAGATTAAAAATAAAAACATGAGGAAGTGAAATCATGATTCCCAGCTATGTGCCACC
AAAGTATAAAGTAGAGGTTGACCCAAACAGATGTATGCTATGTGAGAGATGTACAATAGA 
GTGTTCCTGGGGAGTTTATAGGAGGGAAGGAGATAGAATTATTAGCTACTCAAACAGATG 
TGGAGCTTGCCATAGATGTGTTGT7VATGTGTCCAAGGGATGCAATAACAATT?jy^GAAAA 
TGCAATATCTTGGAGAAGCCACCCATTATGGGATGTAGATGCAAGGGTTGATATTTACAA 
TCAAGCAAAAACCGGCTGTATTTTATTGAGTGGGATGGGTAATGCCAAAGAACACCCAAT

CTATTTTGATAAGATTGTTTTAGATGCATGCCAAGTTACAAACCCATCCATCGACCCATT 

GAGAGAGCCAATGGAATTAAGAACTTACATTGGTAAAAAACCAAAGCAGTTAGAGTTTGA 

ATTTGTTGAAGAAGAGATTGATGGCAAGAAGATTAAAAAAGCTAAGTTAAAAACAAAAAT 

AGCTCCAAACTTAAAGTTAGATACCCCAATAATGATTGCCCATATGTCTTATGGAGCTTT 

GTCTTTAAACGCTCACCTATCATTTGCTAAGGCAGTTAAAGAATGTGGAACATTCATGGG 

AACTGGTGAAGGAGGATTGCCAAAAGCTCTCTACCCTTATGCAGACCACATAATTACCCA 

AGTTGCAAGTGGAAGATTTGGAGTTAATGAAGAGTATCTTATGAAAGGTTCTGCAATAGA 

GATTAAAATAGGGCAGGGAGCTAAGCCTGGAATTGGAGGGCACTTACCTGGAGAGAAGGT 

TACAGCAGAAATTTCAGCAACAAGAATGATTCCTGAGGGAAGTGATGCTATCTCACCAGC 

TCCTCACCATGACATTTACTCAATTGAGGATTTAGCTCAATTAGTTAGAAGTTTGAAAGA 

AGCAACAAGATGGAAAAAGCCAGTGTTTGTTAAAATTGCAGCTGTCCATAATGCTCCAGC 

TATTGCTGTTGGAATAGCAACAAGTGATGCTGACGCAGTTGTTATAGATGGATATAAAGG 

AGGGACAGGGGCAGCACCAAAGGTATTCAGAGACCATGTTGGAATCCCAATAGAAATGGC 

TATTGCCGCAGTAGATCAAAGATTGAGAGAGGAAGGTTTGAGAAATGAAATTAGCATCAT 

AGCAAGTGGAGGAATCAGATGTTCAGCAGATGTATTTAAGGCTATAGCTTTAGGAGCAGA 

TGCTGTCTATATTGGAACTGCTGCAATGGTTGCTCTTGGCTGTAGAGTTTGTGGAAGATG 

TTATACTGGATTGTGTGCTTGGGGAATAGCAACACAAAGGCCAGAGTTGGTTAAGAGATT 

AGACCCAGAAGTTGGAGCAAGAAGAGTAGCTAACTTAATCAAGGCATGGACACATGAAAT 

TAAAGAACTCTTAGGAGCTGCTGGAATTAACTCAATTGAAAGCTTAAGAGGAAACAGAGA 

TAGGTTAAGAGGAGTTGGCTTAAATGAGAAGGAGTTAGAAGTTTTAGGAATAAAAGCTGC 

TGGAGAATAAATAGAACTTTCACAAATAAAAATACTTTATTGAAGGGTGATGCCTTTGGC 

ATCTAAATTCCAAAATCAGCATATAAACTGTGAAAGTTCTATTTAAATTTTTT7VATTTTT 

AAAGGTGAAAGGCATGGAAGAGGTTGTTATAGATGCAAAGGATATGCACTATAGAGAGCT 

GAATGAAAAAATACATGAAATTTTAAGGGAAAATCCAGACATTAAAAAAATTGTCTTAAA 

AAACGTTTTAGGGCAGAGGTTTATTGCCGATGGAATACAGAAGAAAGATTTAACTATAGA 

GATTTACGGCATTCCTGGTGGAGATTTAGGAATGTTTATGAGCGGCCCTACAATAATAGT 

TCATGGAAATGCTGAATTTGCTCCTGGAAACACGATGGATGATGGAACAATAGTTATCTA 

TGGAAGTAGTGGGGATGTAACCGCCCACTCAATGAGAGGAGGAAAGGTTTTTGTTAGAGG 

GGATGTTGGTTATAGAAGTGGAATTCACATGAAAGCTTATAAAGATAAAGTTCCAGTTCT 

TGTGATTGGTGGAAGAGCTAAGGATTTCTTAGGAGAATATATGGCTGGAGGTATTATAAT 

TGTCTTAAACATTGATGAAAAAGGAAATGATTTAGGAAAGGTTAAAGGAAGAATGATAGG 

AACTGGAATTCATGGAGGGGCAATTTATATTAGAGGAGAGATAGACAAAGACCAATTAGG 

TGTTGCTGCAGATATAAAAGAATTTACTGAAGAGGATTTAGAAAAAATAAAACCATACAT 

TGAAGAATTCTGCAAATGGTTTAATCTGCCAGAAGATGTTAAAAATAAACTATTGAATTC 

AAAATGGACAAAAATAGCACCAATCTCAAAAAGACCATTCGGTAAGTTATATACTCCTGA 

CTTAATGTGAAACTTTTAGTAAAAGTTTCATCAAAACTCGTCCATTAAGTTaGACTTTCA 

GTCCTAATTAATGTCCATTATTATAACAGTGGGACTGAACGCAGTGAAGCCCACTCTGGA 

GTATTCCAATAGGCGAAGCCCTATGGTTGCGGAAGCTCTATACTCCCCGACTTAATGTAA 

TTTAATAGAAATTTTTATCAATTTTTAAAACTATTTAGAAGAAACACCAAAATGAGCCTT 

AGGTGAGATTAATGAAATCTTACAAAAACCTAAAAGAGGAAGTTTGGGATACTAATAGAT 

GTAGTGGTTGTGGAGCTTGTGTTGCAGTTTGTCCAGTAAATAACCTATATTTTAGAGAAG 

AAAGCCCAGTAAAGTTTGAGTGCGATGAATGTTCCTGTATAATAGTCCCAGCAGATATCG 

TTGAGCATCCAATTTCAGCAGAGTTCTGTAAGACAGTAGTTTATGACGTCCCTTGTGGAG 

CTTGTTACGATGCCTGCCCAAGGATAAAAAAATCTGCTATTCCAAAACCAAAGGGATTGG 

GGAATATATTAAAGGCAGTTAGAGCTAAAGCATCAATAGAGATAAAGAATGCCCAAAATG 

GTGGAGTTGTAACAGCCATATTGGCAAATGCGTTTGATGAAGGATTAATAGATGGAGCCA 

TTGTAATGATGGATGACAAATGGACTTTAGAGCCAGAATCATATTTGGCGTTATCAAAAG 

AAGATGTTTTAAAGTCTGCTGGTAGCAAATACCTATGGAAAGGTCCAATATTAAAGGCGT 

TAAAAACAGCAGTTATGGAAAAGAAACTTAAAAAATTAGCTGTTGTTGGGACTCCTTGTG 

TTATAAACGCTATCTATCAGATACTATCATCAGATAACGACTTATTAAAGCCATTCAGAG 

AAGCTATAAGATTAAAAATTGCCCTGTTCTGTTTTGAGACTTATGATTACAGCAAGATGA 

TTAAAAAGCTTAATGAAGATGGCATAGAGCCATGGGAAGTTAAAAAGATGGATATCGAAT 

CTGGTAAGTTAAAGATAACCTTAATCAATGGAAACACTGTTGAATATAAGCTTAAAGATG 

TTGAGTCTGCAATGAGGAATGGTTGCAAGGTTTGCGGAGATTTCACTGGCTTAACATCAG 

ATATTTCAGTTGGTAATGTGGGAACTGAGAAAGGCTATTCAACAGTCTTAATAAGAAACA 

AGTGGGGAGAAGGATTCTTTAAGAGAGCAGTTTATAATGGTTATATAACCTATGATGAGA 

ACGTTGATTTAGAAGCAGTTGAAAAACTTGTTGAATTAAAGAAAAAGAGAGTTAAAAAGG 

ATTAAATTCAACTATAACTTTTTTTCAATAAACTCTTTCAAAACATAATAATCCAAGATT 

TCATTTAATAAATCATCATTATGGGCATCTTTATAGGTATTTGGTAATGGTCGTTTTATA 

ATCTCTTTATAGATTTCATTTAACCTATTTAGCCATTCTTCTTTTTTTACATTGCCATTT 

TTTAAATTATATGCAATATCCACAATTAGATTAATCTTATTTTTCTCAATTTCATTAAAT 

TCATCTAAATCAAAATATTCCTTAAATAGCATAACATAATAAAGGCATTTTTCATAATTT 

TCTTTTAGTAGATGGGCAATAGCCAATAAATCACTCATTGATTTGTCAAATCTTTTAATG 
ATTTCATTGACAATCTCATTAGATATTTGGCAACTTTTAATTTTATTTTTAAATTCCTCA
TAAACTTCTAAAAATCTTTTCCAATTTCCAGCCCCTATAAAAAACATCATCATAAAATCA 
AAGAAATCTGAATTGAAATTATATATATTATCTTCAAATAGTTTTCTTAAATGAGGCTCT 
CTGTTTAAGATTTCTTTTCTTTCTTCATAGCTAAAGTCAAAAGTTAAAACACTTTTTAAA 
ATATCCTCTGAATTAGCATCTTTAATAACATCGTTATAAATCCACTCTCCCCCACTATTT
AAAAAGTTTTTTGTAATATCTAATTTCCCAATCTTTGCAAATATTGAGGCAATAGTTGAT 
AACTCTCCACTTAAATTATAATAATCTCCATCTGCCAAAATCCCTTCAACAAACCAGTCG 
AATTTTTTAATTCCTTCTTCAATGTTGTTATTTAAAATTTCTTTATAAGCCTCTTCTCTA 
AGTCCATAATAATCACATTTTGGGACATTTTCTTTAAAGATTTTATTAGATAAACTAAAA 

TACTCTATTGCTTTATCGTAGTCCTTTTTATTATAAAACTCACATCCTTTAAGCCAATAT
TTGTATAGAAAATTTGACGAATCTTTGTTCATAATCTCCCAATTAAGCTCTAAACATCTC 
AT^AATAAAAAATCAACATATTAAAGAATCAATCTTCCTATTCTVATATACTTTGTCTTATC 
TTCMCTCCAGCCATTAAAAAAGCTCTCTTCCTTCTCATACAGCTCTCACACTTCCCACA 
GTGTAAAAAGTCCTCTCCATTATCATGATAGCATGAATAACTATATTTCAAAACCTCAAC 

ACCAAGCTTTTTCTCCAATTCAGCCCCTAATTTAACAATCTCCTCCTTTGTTTTGTCATA
TAGAGGAGCTTCTATCTT7U\CCTTATTTAGTGTTCCATACTCCAAAACTTTATTAAATGC 
CTCAAC7VAATTCTATTGTGTTGTCTGGGAAAGTAACTCCTTCCTCTTTATTTATTCCAAT 
GAATATCTTCTCTGCATCCAATGCCTCAGCAAATCCGCTTGCTATACCAAACATGATTAC 
ATTCCTTGCTGGAACCCATACAGCCTTCATTGTTTCATAAGCTTTCTCACTATCTAACTC 

TTCCATTTTTAATGTTGGAATTTCCTTTTCAGTTATTAAAGAGCTTTTTCCAAACTGTTT
AACGAATGGTAAATCTACAACAATGTGTTCAATACCCAAAATCTCACAAATCTTCTTTGC 
TGAATT7VATCTCTCTCTTAGCCGCTCTTTGCCCATAGTTAAAAGTTATTGCCGTAACTTC 
ATTU^CCTAAATCTTTAGCTATCAGTGTGACTACTGTAGAATCTAATCCACCACTTAAAAC 
AGTTATTGCCTTCATAATAATCACCTTTTAAAATTATATTAAAATTAAATAATTGGAATC 


TTTGGAATTTTGCTGTAATAAAAAAAGGTAAAAGAGAAGAAATTTTAGTAGATTAAGTAT 
TTACCTGTCTCCCAGTCAGTAACTGCTGTTCTGAAGTCATCCCACTCAGCTCTCTTGATT 
TCCATGTAGTTTTCATATATGTGTTTTCCTAAAGCTTTCTGCAAGACTTCATCACATTCT 
AACTCATCCAATGCAGCAGCTAAGTTTGCAGGAACTGACTCAATTCCTAACTGCTTTTTC 
TCTTCTTCTGACATCTTGAAGATGTTTCTCTCAACTGGCTCTGGAGCTGTCATCTTCTTC 

TTAATTCCATCTAATCCAGCAGCTAACATACATGCAAATGCTAAGTATGGGTTGCATGTT
GGGTCTGGAGCTCTGAACTCGATTCTTGTAGCTTTTCCTCTTGCAGCTGGGACTCTGATG 
ATAGCACTTCTGTTCTTGTTTGCCCATGCGATATTTACAGGAGCTTCGTAACCTGGGACT 
AATCTCTTGTATGAATTAACTGTTGGGTTTGTTATAGCAACTAATGCCTTAGCGTGGCTT 
AAGATTCCAGCAATGTAGCTTAAACATGTTTCACTTAATCCATTGTAAGGCCCTTCTGGG 

TCGTAGAATGATGGTTCTCCGTTAAACCAGACACTCTGGTGGCAGTGCATTCCGTTTCCG
TTCATTCCAAAGAATGGTTTTGGCATGAATGTAGCTTTTAAACCGTGCTTCTTAGCAATG 
TTTTTGATTGTCATCTTGAATGTTATAACGCTATCAGCTGTCTTTAAAGCGTTGTCGAAT 
TTGAAATCAACTTCGTGCTGTCCTGGAGCGACTTCGTGGTGTGATGCCTCAACGTGG7U\G 
CCGAGGTTTTCTAAAGCTAAGACGATATCTCTTCTAATGTCTGGAGCGTCGTCTAATGGT 

TCAACATCAAAGTAACCTCCATCGTCAGCAGGAACCCATCTGTGTGGGTTGTGTGGGTCT
CTCTTTAACAAGAAGAACTCTGGTTCTGGACCAACAAAGTATTCTCCATTCATTTCTTTC 
TTTAATTCTTCTAAAATAGCTTTTAATCTGCTTCTTGGGTCTCCTTCGAATGGTGTCTTC 
TCATCTTTATAAACATCACAGATAACTCTTGCAACACTTTTCTCTTCAGGTCTCCATGGT 
AAAACAGAGAGTGTTGATAAATCTGGTTTTAATAACATATCTGATTCTTCAATACCAACA 

AAACCGGTAATTGATGAACCATCAAACCAAACTCCATTTTCAAAGATTTCTCTTAATTCT
TCGATTCCTTTTTCTCCAGCCTT^lACTGGGTATGCGACATTTTTTGGGAATCCTAAGATA 

tctacgaactggaatcttatgaacttaacgttgttcttctttacatattctattgcttgt 
tcgacgttcatttccatcccccaatgcaatttgattgaataattctggaagtaagtttcc 
tacttccatatatatataatttacggtatattcaatttaaattaaaatataaaaatttat 

tcataaatatcaagtgctctattgtaacactctatagcttcattaatttttccaagtttt
tcgagagctatggctttcccattccaagcatctggaatatttgggttaatttccagcact 
ttatcaaaatattttatagcttcattatattttccaagcttgtttagtataatacccttg 
tatagataaagtaatgggtcatctggattcaattttaaagcttttttagtatattcaagg 
gcttgatttaatcttccaagataaatctvaaatttgtattatgtacattaatgcacgaata 

tctttattatttctttcaaaaactttttttagacattttaatgcttctccatatcttcca
agtttaaataatatttctcctttgtacaataaggactggcaatctttgggatttattttt 
aaagcattatcaaaacattctaatgattttttaagtttgccttctctatataatatttcc 
cctttttcagcccaggcaatagctgattttggatattttttcaatattttatcaataatt 
tttaatgcataatcatactctccaagtttttttagtataaaggcagttacatatttaaca 

ggtaaatcagatttttctaatctgcataattttaagaatacctcttttgcttcttctaat
ttacccaaacttaccaataaagctccttttaaaaaatttgctaaaatatattttggtttt 
aattttaacgctttatcaaaatattctaatgctttatcattttcccccaatgttcttaat 

ATTCTTGCTTTTCTTACATA7U\CATCGGGAGATTCCCTAACCTCTAAGATTTTGTCTATC 
AATAATAGGGCTTTTTCATAATTTCTTTTTTCMGTGCATCAAAATATTCATCCCATAAA 
ATGCTTTCATTATATATTTCCATATTCACCCCCTCCCCCAAGGTTTTAGCAATATGCGAT

TTTAATCCCCCTATTTTACGAATTTCGTTAAATTATTATTACTATGATATTATTTATTAA 

AATTATTTTAGTGTAAAATATAAATTTTTCTATCTGTGAATACTGGATATTTTCTTTTAT 

TTCCATATTATTTCCACATTAGTTTATTTAAAGTTAATAAGATTGGGGTATTAATTGTTT 

TATGACATTATACGCTATAATATAATTATAAATATAAAAATTTAATTATAAAAGTCCATA 

AATTACTTGTTTATCCCAATATTTGTTTATTTTGCATTTCCTACATTTTTATACTTGGCT 

CTATAAATTACCGAAAAGTTTTTATACTATTTTTAGAGTAGTTAGGAATGTAATTTCCTT 

TTCCCTAAGAATAAGATTTCCGTTTCCAAGTATATATATGGAGGCTGAAAAAAATGAAAA 

AAGTTGAAGCAATCATAAGACCGGAGAAGTTGGAGATTGTTAAAAAGGCTTTGTCTGATG 

CTGGATATGTTGGAATGACTGTTAGTGAGGTTAAGGGTAGGGGAGTTCAAGGTGGAATAG 

TTGAGAGGTATAGGGGGAGAGAGTATATTGTTGATTTAATTCCAAAGGTTAAGATTGAAT 

TGGTTGTAAAAGAGGAAGATGTTGATAATGTTATTGATATTATATGCGAGAATGCAAGAA 

CAGGAAACCCAGGAGATGGAAAAATCTTCGTCATACCAGTAGAAAGAGTCGTAAGAGTAA 

GAACAAAAGAAGAGGGTAGAGATGTACTTTAAAAATTTAATTATGTAATTTAAAGAGACT 

TGTGGGGTGAAAACATAGCTACTGCGGATTTGTTTGCGAATGCCACAGATATACATTCAA 

TAGTTCAGGCATTGACCACCTTAGCAAATGCTTCAGATGTGTTCTTCCTTGTAGTAATGG 

^^^IInr''''''''"''''''''^^''''^^^^°^^T"=CGATGCTTGAAGGTGGTCAGGTAAGGA 

AGAAAAATGTTAATAATGTTATGATGAAGAACATGGTTGATTGGTTGATTGGTTGTGTTG 

CATGGTTATTCATTGGTGGAATTTTATGTTCAAAAGGTTTTGATTTATCTGCATTTATAG 

ATTGGTGGAAACAAATACTTGGAACAAACTGGCCAAATAATGGATTGGACTTAGCAAGCT 

GGTTCTTTGGTCTTGTCTTCTGTGCTACTGCTGCAACAATTGTCTCTGGAGGAGTTGCAG 

AGAGAATAAAATTCAGTGCTTATGTTCTAATTTCATTGATTATTACAGGTCTATTATATC 

CTCTCTTCGTATATTTAGGACCTTGGGGAGCAAGTATAGTTCCATGGCATGACTATGCTG 

^^^ir^^^^^^^^^'^'^^^^^^^^'^^^'^'^TTTTAGCTTTGGGAGCAATTGCAGCATTAG 

GTCCAAGAATTGGAAGATTTGTTGATGGAAGACCAGTTCCAATATTGGGACACAACATTC 

^^J^^^^^'^^TTTGGGGCATTTGCATTGGCAATTGGTTGGTATGGATTCAACGTAGGTA 

^^J^^J^^^^'^T'^^^^^^^TATTTCAGGGCTTGTATGTGCTACAACTACAATGGCAATGG 

CTGGAGGAGGAATAGGGGCATTAATTGCTTCAAGAAATGATGTTCTATTTACAGCCAACG 

™^n'^^^^^^^'^'^'^'^^^'^^^'^^^'^^T^^^G^GACAGATGTTGTTAGCCCAATAGGTG 

GATTAATAATTGGTTTAATTGCTGGATTGCAAGTTCCAATTGTCTATAAACTTGTTGAAA 

AAGCAGGATTGGATGATGTCTGTGGCGTAGTGCCTGTCCATGGAACTGCAGGTGTTATAG 

S^^S^^^^'^^^'^^'^^^'^^'^^^^^'^^^^^^TATTTGGTGGAGCAGGAGGCGTTAGTT 

TAATAGACCAGATAATTGGAGCAGTATTTTGTATTATTTATGGAACAGGGCTTGGATATA 

TTTTAGCGAAGATTGTTGGTATTGCATTAGGTGGATTAAGAGTTAGTGAAGAAGAAGAAA 

^ISS^^^'^'^^^'^^^^^'^^^^^^^^^^^CCTGCTTATCCAGAAGAGACAGTTATCT 

AAAATTCTTAATTTATTTTAATTTATTTTTGGACAATAATTATTTAATCCTAAACCAACA 

ATATCCGTTTTCTTTTATTATTACCTTATTTCCATCCCATAAATTTATTTTGTAATCTTT 

n^^SJ^^''^'^'^^^^*^'^^^^^^^°^C^TATTGAGGGAATTGGACATGCACTTCCTGG 

AATTTCTGCTTCTCCAATCTGCTTTGGACAATATTTACACTTATCAATTTTTATTCCTAT 

TGAGGGATATGGTGAGAAGATGATTCTCAAGCATAGAAGACCAAATATATATGGAT?IJ? 
.^^^^'"'^'''''^'^''^'^^^^'^^'^'rGAGATGATAATTAACGAGTTATTAAATAG^^ 

II^I^'''^'^°''''^^''^=^'^t^'^^g^gttcagcagtctttttatcSgtggJ?ag? 

AAAAATTTATAGTAACGAGATTTCAATCCCAGATATGGGAGGTTGGCAGGGAT??T?J^ 

atttcctaaattattgaatctaaaaaataatatgatagaaacgaatttgggaatta™ 
tttagaaaaattagatgaaagtttaaaagaaaactcatcacttattttaac^ct??Ig? 

TGGATATTTAGCTCCACAACCATTAAAAGAAATAAAAAAATTATGTSGGAGlGlSrT 

tttatttattgaagatatttcaggaaaaattggaggagattgtggaS5ggaga?a??g? 

T^r^I»'''''^''''''^''^^^''^^^^^T'^TTAAACTGTGAATACGGTGGTTT?T?lG^^ 
TAGTAAAGAAATTGAAGAAAAATTAGGTAATGCTTTAAATGACATTAAAATTTTATCCAA 

aacatataaaacaataaactattttggacttttaaaagaggaS?I??^"SIS^ 
^^cgtataagaaatatgtagaggcatctaaaataattaaagatgamttgaaaa?^?^ 
ttttagagagtttgagggaatatctgtatttattgaatgcgataatccaSaa^SSc 
taaaaaaataaacagtttaataaaattggacaatagaaaatcaataacJJ?^??gSc 

!^^Jf^J^^^^G^TTTTAAAAAATGGGATTGTATTTGAAACAAlSJ^5??Ml?c5c 
IS^r°'^^'''^'''^'^°^'^^^'=^^^'^TTATTATAGCATTAAGCTCTATTTTA?^ 

at^atattattttagaacattgcttttattttttctgcggctcttttSttcttaJ???^ 

ATTAAACCTAAATTAACTTTCGCATCAGTTAGGACTACTAAAATTCcScTCCT^STrr 
ACCATTAGGGTTTTACCGTGCTCTCCTTCAATCATTGT??^SS^GmSA?SA 
ATTTCTGCTGCTGTTCTTTCAGCAGCCCCAAATGCTGCTGAAGCCATAGCCCCAACTAAC
TCAGCATCAACACTCCCAGGCAATTGAGAGGCAATAACTA7U\CCATCCTTACCAACAACC 
ATAGAACCCTTAATACCCTCAGTCTTATTCAACTCCAACAAAACCCTATCAATCATATTT 
TCACCCTTATACTCTGGCTTATGTAAATTATGCGTTTTTGCTATATATAATTTTTTAAAT 
TTATTTGTTATGTGTGGGACATAATTTTATTTATAAGTTCGTCAAGTCCATCTTTTTTCA 
CTGCACAACCTTTAACAATGAATTTTGGGTTGCAAAAGTTATAAACTTCGGAGGTATCGA 
TATCTCCAACATCTGTTTTATTTATAAAAATTCCGTACGGGATTTTTTTAGATTCTAATA 
ATTTTATTATTTCTTCATCTTCTTTAGTTATTCCTTTTGATGCATCCAACACTACTAAAG 
CAAAATTAGTCCCTTTTAATGCCAATTCTCTCATGAACTCAAATCTTTTCTGCCCTGGAG 
TCCCAAAGAAGTGTATCTTCTTATCTTTTATTGTTAATGAACCATAGTCAATAGCTGTTG 
TAATTCCTTTGTATTC7VACTTTTCCAATTTTATCAATTAAATTTTCCATTAATGTTGTTT 
TTCCAACATCACTTGAACCAATAACTACAACCTTAACCTCATCTTTTTTCATGAATCTCC 
CCTAAAAATAAAATATTATAAGTTTAATGCTCAAATAATTTTCTTGCACTTCTTTTTATT 
GCCCCCTCTGCAGCAGCCCCTCCTATTAAAGTTTTTATCTCCTTAACTAATGATTGAGCC 
GCTACAAGTGTTGTTGGTTTTATTTCTGCTTTTTCTACTTCTGCCTTTAATGTTTCATCT 
AAGTTTTGAAGAATCTCTAAGGCAGCATCTAAATCCTTCTGTCTATCTAATAACTTCATT 
GATGAAGCACTCCTTATTAATAACTCTGGATTTAATGCCTTTACCATCCCTTCAATACCT 
GATGTTTCAACAAGAGAAGCCATTGTCTGTAGGGTCATGATAACTTGCTGTTCAATCATC 
TTCTTAGGAGCATTTATAATCTTTCTTCCAACTGTATAGTAGTCCAGAACTCCCGATAAA 
GCAACTGCAGTAACTAAACTCCCCATATCAGCAACAACTGAAGAAACATCAGCAGGAACA 
ACATAAGCCTCTTTTCCTGCACTTTTGGCAAGCTCTACAGCTTTTTTTATCTGTTCCTCA 
GTAGCCAATTCTTTTCCATCAGTGGTTTTCCCACCAATCACATAATGTCCGTGTTGTGGA 
GTTCCAGGAACAGCTGCTGGATGCATTGATGAAATTCCTACATCCTTTCTTTTTGTTCTT 
AATATTGGTTCTAATGAGTAGTATAACACTACAGGTGAAACAGTGCAGGTGTTACAAATA 
ACAGCATTTTCAGGAACATGTTCAATAATTGTCTTTGCTATTCTAAATGTTGCCTTACCA 
AAAGGGGTAAATAAAACATGAATTTCCCCGTGCTTTGCAGCTTCGACATCATCACTAACA 
ACCTTAACCCCAGCATCTTCAACCTTTTTCCATAAATCATCACTCATTATGTTTTTATTT 
GGTTCAGCTAAAACAACATCATGCCCTGCCTCAGCAAATTCAATAGCCATCCTTGAACCG 
CCATAAGGTGGCTCCCCACCGAATTTTTCTGGAAGGTTTAATTTATTTATGTATAGATTT 
TGATTCCCCGCTCCATATACGGATACCTTCATGTTATCAACCTTGATTTTTCTTATATTA 
CGCATTTAGAGTTTCTGATGATTATTATATAAATACTTTATTATTTATTTTTCTTATTAT 
TATGCTATTCACCGCCCTCAACTACTTTAATGTGTTCTCTA7VAGCTTTTGGAATGGCGTT 
AATTATCTCTGTAATTTGAGTATTTGGATTAACATTAGCTATCTCTCCAAGTTTTCTATT 
TATTTTACATCCTAAACTCAGTGCTTCCTCTGTTTTAAAATCAACACTTATTAAAGATGA 
TACTATCTCTGTTAAAGTATCTCCAGTTCCACCAATACATTCCATTGCCTTTATTTTTGG 
TTCTTTTATTTTATCAATTATCTTTTCCTCTCTAATAGTATAGTCAGTTTCCCCTTTAAC 
AACCATATACTTTGGCATTTTTAGTTTATAATCCCTTTCAATGAGTTTTGGCACTTCATT 
ATCGTCTATCTCAGATATAAAACCTCTAACATAAGCTGGATGAGAAGCTTTTTCATCTGC 
TAAGAATGCCAATTCACCAACATCAGGCAAAAAGAGATAAAATTTATCTCCAATATTTGC 
TGCCTTTGCAGCATACATTCCTCCAGCATCTGCAATAATCTTTGGAGAGAAATTTATCTC 
TCTAATTTTTGAAATCTTTGGTTTTATATAATGTATTATAACCAAATCATCATCTATCTC 
TTTTAATGCGTCATAAATCTTTAAGCTTCCATCCCCTTCTCCAATATCCCCTGTAGTTAT 
TACTTTAACATCTTTTTCATCAAAATACTCCAAAGTTTTTAAAACAGCCCCTATCAAAGC 
TCCGGCCCTCCATAGAAATAGGAAATTCCTTGTTATTTATTATTTTATCTCCTTTTAAAA 
TGGGTTTTCCAATAGTTAAATCTAAACCTTTTATTGGCATAGTTCCTGCTATAATCATTA 
GTAATCCCTACAATTTATTCATTTTGAGTATTAACTAACTCCTTAGCTTTTTCAAACGCC 
CTTCTCTAACGCATAGCCCAAAACTGCGATAAATGTATCTCCAGCCCCTGAAACATCATG 
ACCTCTTTGACTTCTGTTGGAACATGGTAAATATTTCATCAACAGTTATTAATGTAGCTC 
CTTTTTCACCTCTCGTTATAACAAAGTTTGAATTGTATTTATCAACTGATTCCAATCCAG 
ATTTTTCCAACTCATCATCTTTATTTTCTATCTCCCTTCCTAAAATTTGGGAAGCCTCTT 
TTAGATTCGGTTTTATTAAATAGACATCCTTATAAAAGTCATTTTTTGGTTTTGGGTCAA 
TTAAGATTTTTCCCTTAAATTCTTTTTTTATGTCATCCATGAGTTCCTTTGTAATTAATC 
CCTTTGCATAATCAGAGATTACTAATATATCTGATTTTCCATTGAGATTTTTAATAACTC 
CCAAAATTTTACTGCTTAACTCATCGTTTATTGGATAGATTTTTTCATAATCAACCCTAA 
GCAATTGCTGATTATAACCCATAGCAACAAATCTATGCTTTACTATTGTTGGCCTTCTTT 
TTTAACGTTTCTCTATCTTCAATTATCATCTATTTCACCAATTTTTTCTCMCCTCCTCA 
CATATAACATGATAAATTGTTAGATGGCACTCTTGTATCCTTGCTGTGTCATTAGAAGGA 
ACCACCAATGCCAAATCAACAATATCCTTTAGCTTTCCTCCACCTTTTCCCAATAAACCA 
ATTGTATAAATCCCCATTTCCTTTGCTTTATTAGCTGCCTTTATAACGTTTTCTGAATTT 
CCACTTGTTGATATACCGGCCAAAACATCTCCTTCTTTTCCCAAAGCTTCAACTTGCCTC 
TCAAAAATCCTATCAAAACCATAATCATTTCCTATAGCTGTTAAAATTGATATATCTGTT 
GTTAATGCAATTGCAGGCAATCCTTTCCTTTCTAACTTAAACCTTCCTACAATCTCAGCG 
GCAAAATGCTGAGAGTTAGCTGCACTCCTCCATTTCCACAAATTAAAATTTTATTTCCAT 
TTTTTAATGCATTATATATGACTTCAATAGCTTTTTTTAACTTTTCCTCATCTTCTTCAA 
TGAATTTTAGTTTCACATTTGCACTTTCCTCGAAATACTTTTTCATAATCATCACCAAAT 
TATTTATACCTACTATATTAATACTATATAATTGTAACATAAATTAATTTAATAAAAATT 
TACTAAAGGGAGAGGATGATAAGAAAGGCAGTAATTCCAGTGGCTGGTTTTGGGACTCGA 
CTATTACCAATAACAAAGGCTCAACCGAAGGAGATGCTTCCAGTAGTTAATAAGCCAATA 
GTGCAATATGTTGTTGAAGATTTGGTAGAAGCAGGAGTAAAGGATATTTTATTTGTAACT 
GGGAAGGGAAAACAGGCAATAGAAAACCACTTTGACGTAAATTATGAGTTGGAGTGTAAA 
TTAGAGAAATCTGGAAAATATGAACTTCTAAAAATTATTAAAGAAATTGATAGGTTAGGG 
AATATATTTTATGTAAGACAGAAAGAGCAGAAAGGTTTAGGAGATGCTATTTTGTATGGG 
GAGGAATTTGTTGGGGAGGAATACTTTATAGCAATGGTTGGAGATACAATTTACTCTAAA 
AATATTGTAAAAGATTTAATAAMGCTCATGAAAAATACGGCTGTTCAGTTATTGCATTA 
GAGAGAGTTCCAAAAGAAGATGTTTATAAATATGGAGTAATTGATGGGGAAGAGATAGAA 
AAGGGCGTTTATAAAATAAAAAATATGGTAGAAAAACCAAAAGTTGAAGAGGCACCTTCA 
AATTTGATTATAACCGGGGCTTATTTATTATCTCCAAAGATATTTGAAAAAATTAGAGAA 
ACTCCTCCTGGAAGAGGAGGAGAGATTCAGATTACAGATGCTATGAATCTACTTTTAAAA 
GAGGAAGATATTATAGGGGTTGAAATTAACTGTAAAAGATATGATATTGGGGACGCTCTT 
GGATGGTTAAAAGCAAATGTAGAAATTGGAGCTGAAAGATTCCCTGAATTTAGAGAATTC 
TTAAAAGAATTCGTTAAAAATTTATAATCTAATTTTATTTTTTATTAAGTTGGGATAGTA 
TGGATACAGCAATAATATTGGGACTTTTAGTGGCTGTGTTTTATGGGGTTGGGACATTTT 
TTGCGAAAATTGTCTGTGAAAAAAACCCTTTATTTCAATGGATAGTGGTAAATATAGTTG 
GGATTATATTATGTTTAATCATATTACTCAAATATAAAAATATAATTATTACTGACCAAA 
AAATTCTTACTTATGCAATAATATCAGCAGTCTTAGTAGTGATTGGTTCTCTATTGTTAT 
ATTATGCGTTATATAAAGGAAAAGCAAGCATTGTTGTGCCCTTATCATCAATAGGTCCAG 
CGATAACAGTAGCTCTGTCAATACTGTTTTTAAAAGAGACTCTAACACTTCCACAAATGA 
TTGGGATAGTTCTTATAATTATTGGGATTATTCTCCTTTCAATATCTAATTAATTTATTT 



AATTTATAAAGTTTAAATTTATAAGGTAATAAAAAATAAAGATAAAAATAGTTACTGCCC 
TTCTAAGGTTAATAAATATCTTCTTGCCCCTTGCATTCCAAGCTGTAATTTTATTGCCTT 
TATAAGTTCATAAACATCTTCAACAGTTTCAGCATCATATAACGCTTCCTTAAATTTCTC 
TCTTTTTGAAGGTTCTATCAAATTTAACAAGTATTCAACTGTGGCAACCAACTCAATATA 
TTTCCTCTCTCTATTCATTAAGTTATCTAATTCATTTATTATCTCTTGAATTTTTGGAGA 
CGGCATTCATTTCACCTATTGTGTAATTTTAAATATCATTACTACATAAAGTCATATAAA 
TATTTTAACACCATACTCAATATTTTTATGGTGAGAACTTGGCAATGATTGGTTTAGTAG 
GGAAACCAAACGTAGGGAAATCAACAATGTTCAATGCTTTAACTGAAAAACCAGCAGAAA 
TTGGAAATTATCCATTTACAACAATACAACCAAATAAAGGTATCGCTTATATAACAAGCC 
CCTGTCCTTGTAAGGAATTGGGAGTTAAGTGTAATCCAAGAAATTCAAAATGTATAGATG 
GGATTAGACATATTCCAGTTGAAGTTATAGATGTGGCTGGTTTAGTCCCAGGAGCACATG 
AAGGTAGAGGGATGGGAAACAAGTTTTTGGATGATTTAAGGCAAGCAGATGCATTTATAT 
TGGTTGTTGATGCCTCTGGAAAGACAGATGCTGAAGGAAATCCAACAGAAAACTATGACC 
CAGTTGAGGATGTTAAATTCTTATTAAATGAGATAGATATGTGGATTTATAGCATTTTGA 
.r. *^GAAAAATTGGGATAAGTTGGCAAGAAGAGCCCAACAAGAGAAGAACATAGTTAAAGCTT 
TAAAAGACCAATTAAGTGGATTGAATATAGATGAGGATGACATAAAGATGGCTATTAGAG 
ATATGGATGAAAGCCCAATTAAATGGACTGAAGAAGATTTGCTAAACTTGGCTAAAAAGC 
TTAGAAAAATTTCAAAACCAATGATTATCGCTGCAAATAAGGCAGACCACCCGGATGCAG 
AGAAGAATATTGAAAGGCTAAAGAAAGAGTTTAAGGACTATATAGTTATTCCAACATCTG 
*=^^GAGATAGAGTTAGCTTTAAAAAGAGCTGAAAAGGCTGGAATTATAAAAAGAAAAGAAA 
A'^GACTTTGAGATAATTGATGAAAGCAAAGTGAATGAACAGATGAGGAGAGCTTTTGATT 
ACATAAAGGACTTTTTAAAGAAGTATGGAGGAACTGGAGTCCAAGAATGCATAAATAAAG 
CTTATTTTGATTTGTTGGATATGATTGTTGTCTATCCAGTTGAAGATGAGAACAAATTTT 
CAGATAAGCAAGGAAATGTATTACCAGATGCATTTTTGGTTAAAAAAGGAAGTACTGCAA 
GAGACTTAGCTTATAAGGTGCATACAGAGTTGGGAGAGAAATTTATCTATGCAATAGATG 
CAAAGAAGAAGATTAGAGTAGGAGCTGATTACGAATTGAAGCATAATGATATTATTAAAA 
TTGTCTCTGCCGCAAAATAATTAAATTTTTGGTGGCCTCCATGGCTACAACTTATGAGCT 
GAGAATTTATGGAAATGTGGAGTGTGCTGAATTTATAGATAT^GTTGAGAGTTTAGGAAA 
ATTGTTGGATGTGAATGGGGTTGTTTATGTTTATAAAGACAGTGTTAGGATTTTGGCAAA 
CTTTCCCAATGAGAAGAAAAGACAGCTTTTTAAGGAAATCATTAAAGATTTAGAAGATGA 
TGGTGGGTTAATAAAGGTTGAAAGGATAGAAGAAAGAGATTTAAATACATATATTGAATT 
TCCTAATGGATTGAATAAGATTTCAACGAATGAGTTAAAAGAGATTAATAAAAAGTTGGA 
TAAAACAATTAGCTATTTAGAGAATATTTTTAATGCCTTAGAGAAGCAAATAAAAGTTTC 
AGAGGAGATTAGAGACATATTGAAAGATACCTTTGAAGTTTAACTTTATTCAAACACCTT 
ACTCATACACCCAGCCAACAAACCGGCTATAGCATCATCTAAGAACATAAAGCCTTTCCT
^'TC'^^CTCTCCAATTATTCCGGGTTTTTTAGCATCATAGAATCTAAAGTTAAATATTGC
CTTAGTTCCAGCAATCTCATTTGCTATAGCTAATCCAATAACCTCATCAACATACACATA 
GTTTGGGTCTTCGTTGTAGTTGAATGGCAGATTGTTAGCTCTGCCTTCATTATCCAACAA 
AATTGCTGCAATTAAmAAAGTTGAGACATTAGGGTTAGACAACTGCTTTAAmAAAATCTC 
CTTAAGTTTCTCTTTAACTCTGTCTCTTTCTTCATTACTCCCAATATATAAATCCATTCC 
AGCATCCAATAAGCTGTCAATAGTTATTCCA/VACTCCTCTAATTTTTTnATAATnATTTT
CTCACTTTAATATTTTGGATTTTATGGGCATATCTCTTATAATAAACATTAAAATAAAAT 
CTCATTTTTTATTAAAAAACTTAAATATTCTATACTATTTTgTTGTTATCTCAACACCAT 
TTTCAGTTATTAATATTGTGTGCTCAGCCTGGCCAACTATTCCATTCTCCCTCTCTTTTA 
ATATTGGATAACCGTAAATGCAAGATGCCCTAATTTU^CGAGTTTAAAGCCAGCCTCTCGC
TCTCATTTTTTAAAACCCATCTTTCAGCAAAGGGTAGATAAGGGTAATTTTTTGATATAA 
CGTCTAAAAGTTTTCTTGCTTGTGGCAATCTAATTGGTCTTTTGGCTAAAAATTTATATA 
TGTTTCCAAGATTCCCATCTTTAACCATTCCAAAGCCATCTGTTGCAAACGGCTCTATAG 
CCACCAAATCTCCAACATCTATATATTGATTGGTTCTTTCATAGACATTTGGAATACTAA 

TTCCTGTATGCAACTCATATCTATGCATCACATGTCCAGAGAGGTTGGATATTGGTTTAT
AACCATAACTCTCAATAACCTCCTGAATAATCTTTCCCATCTCTCCAATGTTCATTGGAG 
GGTTTATCTCCTTAATAACTGTATATAGTGCATCTTCAGATGCCTTTACCAAATCTTTAT 
AAGAGTTTGATAAATCTACTGTTATAGCTGTATCTGCTATATATCCATCGACATGAGCTc 
CTAAATCTAATTTAACAACATCATCATCTTTAAACTCCAAGTTATCATTTAATTTTGGAG 

TGTAATGAGCTGCTATCTCATTAATTGATATATTGCACGGAAATGCTGGCTCCCCkCCTA
ATTCCCTAATTCTATTTTCTVACAAATTCAGCAACTTCTAATAGCTTAACTCCTGGCyTTA 
TTAAtTTTACGGCCTCCTCTCtGACyTTAGATGCTATTTTCCCTGCCTCTATAATCTTTT 
CATACCCyTCAATCTCCATACTTTCATCCTTTAAGTTTTGGTTTTAATAAGTTTTTtAGT 
GTTGTTTGATAACCTTTAAATTGATTGTTATTTATTGCTGAAACTATAATATAATCAATT 

TGTCTTTCTAAGTTtTCAATATCTTTCCCATTTTTTAAGtTGTAGATAATATCCTTGACT
ATATTTTTTATTTTCTCCTTAACTTCATCTGAGAACGTGGGAATTTTTGCATTCTCTAAC 
AACTTTTGTGTAAATGCCACTCTCCCTCCTCTTCTTCCACCATTTGC/UXGATAATAATTT 
CTAAAAAATGAGCTGTTCAAATATCCTAACAAAAAGTATATGTCATCATCGTTGTAGGGT 
TGGATAAATATAACGTCTCCTGAAGGTAATAGTTCATCATCTCCTAAACTAAACCTATTA 

TATGGTTTTCTGTCTAAAGTTGGAACATATATTCGTTTTTTATTTAGATTTTTTATTAAA
AATTTATAGTTTCTCAATGCCTGCCAATTAAACCATTTTTTGTTTTTTGGAAGGTATCTA 
TTCTCCATTCTGTCTTTAAACTTCAACAATTTTTTATATATGTTTGGATATTTGGTTTTA 
AATATTTCTTCATCTTTTAGGTTGTCTTCAATTAATATATATTGAACAAATCCCTCAACT 
ACAAACCTTTTACAGTTTTTTGCCTTAACPlAAATTTTTTATAAGTTGTTTTTCATCTTCA 

TTTAGCTTTGAGATGTCATCTTCATTTAATAAAAATGCCTCATCAAATCCAGAAACTAAG
CCCACTCCAACTTTTGCTATATCCTTTAATAACACATGAGGAAAATCTGGGATTTTTGTA 
AAAAAGGTAGACCAAGGTTTCTGGTGTAGTGTAATGAGGAATTTCCATGTATTCAAAAAT 
ATCGTTTGATTTTTTTGCATTTAACACATTAATAGCTTTTTCTTTTATCTCTTTTAATTT 
AACTTTTTTTGAGATAATTCTAATAACATCAATTTTTTCAGATTTGTGGTTAAATTCTCC 

TTTTTTATVACTTAAATATTATTGTTTCAGGATTTTCATTTTTAAATAGCCTAACTTCATC
CAAATCTLATAATTATTTCCAATTTTCCATGTTTTAGAATGTTGTCTCTTACAATTTTTGC 
ATATGTGTTATAAAAAAAGTGATATGGAACAATATAAATCAACTCTCCACCATCTTTTAA 
AAGATTTATTGATTTTATAATGAAAGCATAATAAATGTCCCCCTCACTTGTGCCTATAAT 
CCGTTTTACTTCTTTTTTTATAAATTCTGGAAGACTGTTGAAATGGGCATAGGGAGGATT 


TCCAATAATTAAATCAAATTTTTCTTTAAAGTTATAGCTTAAATAATCTCCTAAAATTAT 
CTCAAATTCATCAAATTTTGCCTTGCAGTGGTTGTATAAATCTTTATCTATTTCAATACC 
CACACAATTTTTGTATCCAAATTCTCTTAATACCTCTAAAAATATTCCTTTTCCGCATCC 
AGTGTCTAACACTAATCCATTTTTTGGGATTGTAGAAAGATTTATCATTAATTCAGCTAT 
TTCTTTTGGAGTTTCAACAAAGCTAATCTTCTCCATGTTTGCCCTCTTTAATATTAATCA 
GTTGGATTTTTATTGAGTTCAAGACAAGAATTTCTGTTTTTATTGAATTTAATGACTTTT
CTAAAGATTCTAAAATTGCATCTATAAGTTTTTTAACAAAAGCAAAGTAATCTATTGGGT 
CACCAAGATTTGGGCTATACCTTATCTGAAACATGTTATTTCTTGGATTAACATAAAAAT 
CGTCCTTAATCTCTTCTAATAAGAAAAATTTGTATCCTCTTCTATACTTATCAACAACAA 
AAATTCCATAAGATACAATTTCATTACCTGAAATTCTATTATAGACCAATTCTCCATTTA 


TGGTTATTTCTGACGTTAATCTCTGCCGTTTTATGCCCAAATATTVAACGTAAGAGGTTAT 
TGTATGTTGTTACATCGTTTCTTGTATTACTTTTTAAATCACCAAATTTATTGTTTATGA 
ATATTAAAAAATTCTTTCCATTTATTATTCCATAGCATAATAAATCATAAGGTGTCCTTC 
TACCTTCATAGGAATATATTTCATCCTTTGGTTCAGAAAATTTAATTTTTAAGTTTTCTT 
CAGAGATTGTATAGTCTTTTAATAAACTTAAAACCTTATCGTCC/^TVATTTCTATCGTTAA 

TATTCTCATCATTTTTTATTTTATTATAAATTTCCCTAAATATATCTCTTAATTTTCCCG
ATAAAATTTTAAGTTCGTTATTAACCAATTTTACCACCATTTAATGCATTATCTCTTTAT 
GACAATTGTTAAAATGTCTCCATCCTCTAATTTATGGTCTAATCCAACTCTCTGCCCTGG 
ATGCTTTGCTGACTTCCCCCAAACTTGGGCATACCTGAAATTTCTAACGAAATCTTTATG 
CAGTTTTTCACATVACATCTTTTACAGTAGCTCCTCTTCTCATAATTAGTGGTTCATCAAA 

GTCTGGCTTTTTCCCCTGTGGTTTTAGATAAATCTTTATAAAACCCAATTTCTCATAGAT
TTTCTCTTTCAATAAATCCAAGTTAATTCCTTTGTTACCAGAAACTAAGATATAATCCTT 
ACCAAATTCCTCTAACTTTTGTTTTATATATTTTAGATACTCCTCATCAGCTAAGTCTAT 
CTTATTAACTACCACCAAAGAAGGGATATAAACTCTGTTTCCAGCTACAACATCAATAAA 
CTGCTCTAAGGTTATATCCTCCCTTATAACAACATCTGCGTTGTGTATCCTATATTCATT 
TAATATTGCTTCAATTGTATCTTCATCGATATGGGTTAATGGAACGGTTGAACTAACGTT

AATCCCTCCTCTCTCCTTAACTTTGATTTTAACATCTGGAGGAGTTTGGTCTAATCTAAT 

TCCAACATTGTAGAGTTCTTTTTCAAGCACTGGTAGGTGGTCTAATGTGTAGATATCAAC 

TGTTAATAAAATCAAATCAGCACTTCTTACTGCAGATAAAACCTCTGTCCCCCTACCTTT 

CCCTGATGAAGCACCAACAATAATTCCAGGAGCATCTAAAAGCTGAATTTTAGCTCCCTT 

ATATTCTAATATACCTGGAACAATTGTTAAAGTAGTGAAAGCATAAGCCCCAACTTCCGA 

TTTAGCATTTGTTAATTTATTTAGCAGGGTTGATTTCCCAACAGATGGAAATCCTACAAA 

GGCAGCTGTAGCGTCTCCACTTTTCTTTACAGCATAGCCCTTTCCTCCTCCACCTCCCCC 

TCTACTCTGAGCCTGCTCTCTCAACTTAGCTAATTTAGCCTTTAACCTACCAATGTGTTT 

CTGTGTGGCTTTGTTATATGGTGTCTTTTTTAATTCTTCTTCTATCCTTCTAATTTCTTC 

TTCAATTCCCATAACATCACCAACAATATTAAATTTATGGTTTCTAAAAATAAAAATCCT 

CTTAAGAGATAACAAAAATAGCATTTATAATTTTACGCATGCATTTATTATAAATTGCGT 

TTGCTACATTAAATAAT^TAGTTAAAAAAAGAGAAATTTATAGTTTCCTCTGACTACCTA 

AGAAGTCACATTCTTGTTCTTTATAGAGCTTGGACATTAATTgGGGCTGAAAGCCCCAAC 

TTAATGGACGGGAGGTATCCCAATAGGAGGTCTCCTCCTATGGTTATAATTCATCAACTA 

ATTTAATAATCTCTTTAACTCTGTCATCTACGATGTAGTAGAAATTCCATGTTCCTTCTT 

TTCTTGCTTTAACTATTCCAGCCTTTTTTAAGATGTTTAAGTGGTGTGAGATTGTTGGCT 

GTGGCTTTTTTAGCTCATCTATTATTTTACAAACGCACATGCTTCCATTTTCAGCCAATA 

ACTTTAAAATCATCAATCTTGTTGGGTCTCCAAATGCCTTGAAAATTTCTGCCGCTTTTT 

CGTACTTCTCCATTGTTATCCCTCGTGATTATTTTTTATTTCTATTAAAATGTTTAAGTA 

TATATTTAGACATATATTTTTCCATATTGATGTTTAAATTGCAATCAACTTTACTATAAT 

ACATTACTAGTATTTAAATATTTTGGTTTTGTTTATTTCAGCTTGGATACGTCAATCTCA 

ACCTTGATAGTCTTTGGATTATCAACACCTTTCAAGGTTTCAATCATTGCTTTAATTGTG 

CTACCAACAATCTTAGAGACAAAAGGAACAGCAGGAATTATTTTACCATCAACAATTATT 

TTTATACCTTTCGCTAATACACAATCATCCCATCTTGCTTCTCCTTTAACAACTGCCTTA 

ACAAATGTTCTGCAGTTATACCCACAATGTCCACAGTTTAAGTTCATTGTAGGAACTACG 

GCTTTTTCATAAATAACTTTTAAAACATCGTCAATGTTGTAGTTGTAATCTTCAATTATC 

ATTGCTGTATGGTCATCAATCAAATCACTCCCATCTTTATCCTTAAGCATAACTATCTTA 

GGGATGTTTAACCTTTTTAAAGCCTCTTT7VAAACCTTCTATAATAACAA71ATCTATATTG 

TAATCTGATAATACTGATAAAATGTTTTCTAAATCCATCCTATCTGTAAAGAAAACTGTT 

TTACTGTCAGTTGCTAAAACTGTTATTTTAGCTCCCGCGTTTGACAATCTGTAAGTATCA 

GTTCCTTTTTTATCTACTTCTACATCTTCTTTAGTGTGCTTGATAACTGCTATTTTTTTA 

TCAGAATGTTTTAGAATTTCTTCAATTAGGGTTGTTTTACCAGAATCTTTATAACCAATA 

ACGCCTATGACTCTCATGTTATCACCATAAATATAAAAACTGTAGGTTTAACATATTTAA 

ATTTTATGCATTAATTATTCTATCACAAAATAAAAATTTGAGGGATAGTATATGATGTTC 

GTTCATATAGCTGATAATCACTTAGGTTATAGACAGTATAACTTGGATGATAGGGAAAAA 

GATATTTACGACTCATTTAAATTATGTATAAAAAAGATTTTAGAGATAAAGCCAGATGTT 

GTTTTACATAGTGGTGATTTATTTAACGATTTGAGACCTCCAGTAAAAGCTTTAAGAATA 

GCTATGCAGGCGTTTAAAAAATTACATGAAAATAATATAAAGGTTTATATTGTTGCAGGA 

AACCATGAAATGCCAAGAAGGTTAGGGGAGGAATCTCCATTAGCCTTACTAAAAGATTAC 

GTTAAAATTTTAGATGGAAAAGATGTTATAAATGTAAATGGGGAAGAGATATTTATCTGT 

GGGACTTATTATCACAAAAAGAGCAAAAGAGAGGAGATGTTAGATAAATTAAAAAATTTT 

GAATCAGAAGCTAAAAACTATAAAA7VAAAGATATTGATGCTTCATCAGGGAATAAATCCA 

TATATTCCACTTGACTATGAACTTGAACATTTTGATTTACCAAAATTTTCCTACTATGCG 

TTGGGACATATTCACAAGAGGATTTTAGAGAGGTTTAATGATGGAATTTTAGCTTACAGT 

GGTTCAACAGAAATTATTTATAGAAATGAATATGAGGACTATAAAAAAGAAGGAAAAGGA 

TTTTACTTAGTTGATTTTAGTGGAAATGATTTGGATATCTCTGATATAGAAAAAATTGAT 

ATTGAATGCAGAGAATTTGTAGAGGTAAATATTAAAGATAAGAAATCTTTTAATGAGGCA 

GTGAATAAAATTGAAAGATGTAAAAATAAGCCAGTTGTTTTTGGAAAAATTAAGAGAGAA 

TTTAAACCATGGTTTGACACTTTAAAGGATAAAATTCTAATTAATAAAGCTATTATAGTA 

GATGACGAATTTATAGACATGCCAGATAATGTTGATATTGAGTCACTAAACATTAAAGAG 

CTTTTAGTGGATTATGCAAATAGGCAGGGAATTGATGGGGATTTAGTTTTAAGTTTATAT 

AAAGCTCTATTAAATAATGAAAATTGGAAAGAGTTATTGGATGAATATTACAACACTAAA 

TTTAGGGGATGAGTATGATACTAAAAGAAATAAGGATGAATAACTTTAAAAGTCATGTGA 

ATTCAAGAATTAAGTTTGAAAAAGGGATTGTTGCAATTATTGGAGAGAATGGAAGTGGAA 

AATCATCTATCTTTGAAGCAGTGTTCTTTGCCTTGTTTGGGGCAGGCAGTAATTTTAATT 

ACGACACAATAATAACCAAAGGAAAAAAATCCGTTTATGTTGAATTGGATTTTGAAGTCA 

ATGGAAACAACTACAAAATTATCAGAGAATATGATTCTGGAAGAGGGGGAGCTAAGCTCT 

ATAAGAATGGAAAGCCTTACGCTACTACAATTAGTGCAGTTAATAAAGCAGTAAATGAM 

TCTTAGGCGTTGATAGAAATATGTTCTTAAACTCCATATATATTAAACAGGGGGAGATAG 

CTAAATTTTTGAGTTTAAAACCCTCCGAAAAATTGGAAACAGTTGCGAAACTTTTGGGAA 

TAGATGAGTTTGAAAAATGCTATCAAAAAATGGGGGAGATTGTTAAGGAATATGAAAAAA 

GATTAGAAAGAATTGAAGGAGAGTTGAATTACAAAGAAAATTATGAAAAAGAATTAAAAA 

ATAAAATGAGCCAATTAGAAGAAAAAAATAAAAAATTAATGGAAATTAATGATAAACTAA 



